INTERNATIONAL SERIES IN PHYSICS
LEE A. DuBRIDGE, Consulting Editor

INTRODUCTION TO THEORETICAL PHYSICS

The quality of the materials used in the manufacture of this book is governed by continued postwar shortages.

Bacher and Goudsmit: Atomic Energy States
Bitter: Introduction to Ferromagnetism
Clark: Applied X-rays
Condon and Morse: Quantum Mechanics
Curtis: Electrical Measurements
Davey: Crystal Structure and Its Applications
Edwards: Analytic and Vector Mechanics
Eldridge: The Physical Basis of Things
Hardy and Perrin: The Principles of Optics
Harnwell: Principles of Electricity and Electromagnetism
Harnwell and Livingood: Experimental Atomic Physics
Houston: Principles of Mathematical Physics
Hughes and DuBridge: Photoelectric Phenomena
Hund: High-frequency Measurements
Hund: Phenomena in High-frequency Systems
Kemble: The Fundamental Principles of Quantum Mechanics
Kennard: Kinetic Theory of Gases
Koller: The Physics of Electron Tubes
Morse: Vibration and Sound
Muskat: The Flow of Homogeneous Fluids through Porous Media
Pauling and Goudsmit: The Structure of Line Spectra
Richtmyer and Kennard: Introduction to Modern Physics
Ruark and Urey: Atoms, Molecules and Quanta
Seitz: The Modern Theory of Solids
Slater: Introduction to Chemical Physics
Slater: Microwave Transmission
Slater and Frank: Introduction to Theoretical Physics
Smythe: Static and Dynamic Electricity
Stratton: Electromagnetic Theory
White: Introduction to Atomic Spectra
Williams: Magnetic Phenomena

Dr. F. K. Richtmyer was consulting editor of the series from its inception in 1929 until his death in 1939.

INTRODUCTION TO THEORETICAL PHYSICS

BY
JOHN C. SLATER, Ph.D.
Professor of Physics, Massachusetts Institute of Technology
AND
NATHANIEL H. FRANK, Sc.D.
Assistant Professor of Physics, Massachusetts Institute of Technology

First Edition
Tenth Impression

McGRAW-HILL BOOK COMPANY, Inc.
NEW YORK AND LONDON
1933

Copyright, 1933, by the McGraw-Hill Book Company, Inc.
PRINTED IN THE UNITED STATES OF AMERICA
All rights reserved. This book, or parts thereof, may not be reproduced in any form without permission of the publishers.

PREFACE

The general plan of a book is often clearer if one knows how it came to be written. This book started from two separate sources. First, it originated in a year's lecture course of the same title, covering about the first two-thirds of the ground presented here, the part on classical physics. This course grew out of the conviction that the teaching of theoretical physics in a number of separate courses, as in mechanics, electromagnetic theory, potential theory, thermodynamics, tends to keep a student from seeing the unity of physics, and from appreciating the importance of applying principles developed for one branch of science to the problems of another.

The second source of this book was a projected volume on the structure of matter, dealing principally with applications of modern atomic theory to the structure of atoms, molecules, and solids, and to chemical problems. As work progressed on this, it became evident that the structure of matter could not be treated without a thorough understanding of the principles of wave mechanics, and that such an understanding demanded a careful grounding in classical physics, in mechanics, wave motion, the theory of vibrating systems, potential theory, statistical mechanics, where many principles needed in the quantum theory are best introduced. The ideal solution seemed to be to combine the two projects, including the classical and the more modern parts of theoretical physics in a coherent whole, thus further increasing the unity of treatment of which we have spoken.
Two general principles have determined the order of presenting the material: mathematical difficulty, and order of historical development. Mechanics and problems of oscillations, involving ordinary differential equations and simple vector analysis, come first. Then follow vibrations and wave motion, introducing partial differential equations which can be solved by separation of variables, and Fourier series. Hydrodynamics, electromagnetic theory, and optics bring in more general partial differential equations, potential theory, and differential vector operations. Wave mechanics uses almost all the mathematical machinery which has been developed in the earlier part of the book. It is natural that the historical order is in general the same as the order of increasing mathematical difficulty, for each branch of physics as it develops builds on the foundation of everything that has gone before. In cases where the two arrangements do not coincide, we have grouped together subjects of mathematical similarity, thus emphasizing the unity of which we have spoken.

In a book of such wide scope, it is inevitable that many important subjects are treated in a cursory manner. An effort has been made to present enough of the groundwork of each subject so that not only is further work facilitated, but also the position of these subjects in a more general scheme of physical thought is clearly shown. In spite of this, however, the student will of course make much use of other references, and we give a list of references, by no means exhaustive, but suggesting a few titles in each field which a student who has mastered the material of this book should be able to appreciate.

At the end of each chapter is a set of problems. The ability to work problems, in our opinion, is essential to a proper understanding of physics, and it is hoped that these problems will provide useful practice.
At the same time, in many cases, the problems have been used to extend and amplify the discussion of the subject matter, where limitations of space made such discussion impossible in the text. The attempt has been made, though we are conscious of having fallen far short of succeeding in it, to carry each branch of the subject far enough so that definite calculations can be made with it. Thus a far surer mastery is attained than in a merely descriptive discussion.

Finally, we wish to remind the reader that the book is very definitely one on theoretical physics. Though at times descriptive material, and descriptions of experimental results, are included, it is in general assumed that the reader has a fair knowledge of experimental physics, of the grade generally covered in intermediate college courses. No doubt it is unfortunate, in view of the unity which we have stressed, to separate the theoretical side of the subject from the experimental in this way. This is particularly true when one remembers that the greatest difficulty which the student has in mastering theoretical physics comes in learning how to apply mathematics to a physical situation, how to formulate a problem mathematically, rather than in solving the problem when it is once formulated. We have tried wherever possible, in problems and text, to bridge the gap between pure mathematics and experimental physics. But the only satisfactory answer to this difficulty is a broad training in which theoretical physics goes side by side with experimental physics and practical laboratory work.
The same ability to overcome obstacles, the same ingenuity in devising one method of procedure when another fails, the same physical intuition leading one to perceive the answer to a problem through a mass of intervening detail, the same critical judgment leading one to distinguish right from wrong procedures, and to appraise results carefully on the ground of physical plausibility, are required in theoretical and in experimental physics. Leaks in vacuum systems or in electric circuits have their counterparts in the many disastrous things that can happen to equations. And it is often as hard to devise a mathematical system to deal with a difficult problem, without unjustifiable approximations and impossible complications, as it is to design apparatus for measuring a difficult quantity or detecting a new effect. These things cannot be taught. They come only from that combination of inherent insight and faithful practice which is necessary to the successful physicist. But half the battle is over if the student approaches theoretical physics, not as a set of mysterious formulas, or as a dull routine to be learned, but as a collection of methods, of tools, of apparatus, subject to the same sort of rules as other physical apparatus, and yielding physical results of great importance. The title of this book might have been aptly extended to "Introduction to the Methods of Theoretical Physics," for the aim has constantly been, not to teach a great collection of facts, but to teach mastery of the tools by which the facts have been discovered and by which future discoveries will be made.

In a subject about which so much has been written, it seems hardly practicable to acknowledge our indebtedness to any specific books. From many of those mentioned in the section on suggested references, and from many others, we have received ideas, though the material in general has been written without conscious following of earlier models.
We wish to express thanks to several of our colleagues for suggestions, and particularly to Professors P. M. Morse and J. A. Stratton, who have read the manuscript with much care and have contributed greatly by their discussions.

J. C. S.
N. H. F.
Cambridge, Mass.,
September, 1933.

CONTENTS

Preface

Chapter I. POWER SERIES
Introduction
1. Power Series
2. Small Quantities of Various Orders
3. Taylor's Expansion
4. The Binomial Theorem
5. Expansion about an Arbitrary Point
6. Expansion about a Pole
7. Convergence
Problems

Chapter II. POWER SERIES METHOD FOR DIFFERENTIAL EQUATIONS
Introduction
8. The Falling Body
9. Falling Body with Viscosity
10. Particular and General Solutions for Falling Body with Viscosity
11. Electric Circuit Containing Resistance and Inductance
Problems

Chapter III. POWER SERIES AND EXPONENTIAL METHODS FOR SIMPLE HARMONIC VIBRATIONS
Introduction
12. Particle with Linear Restoring Force
13. Oscillating Electric Circuit
14. The Exponential Method of Solution
15. Complex Exponentials
16. Complex Numbers
17. Application of Complex Numbers to Vibration Problems
Problems

Chapter IV. DAMPED VIBRATIONS, FORCED VIBRATIONS, AND RESONANCE
Introduction
18. Damped Vibrational Motion
19. Damped Electrical Oscillations
20. Initial Conditions for Transients
21. Forced Vibrations and Resonance
22. Mechanical Resonance
23. Electrical Resonance
24. Superposition of Transient and Forced Motion
25. Motion under General External Forces
26. Generalizations Regarding Linear Differential Equations
Problems

Chapter V. ENERGY
Introduction
27. Mechanical Energy
28. Use of the Potential for Discussing the Motion of a System
29. The Rolling-ball Analogy
30. Motion in Several Dimensions
Problems

Chapter VI. VECTOR FORCES AND POTENTIALS
Introduction
31. Vectors and Their Components
32. Scalar Product of Two Vectors
33. Vector Product of Two Vectors
34. Vector Fields
35. The Energy Theorem in Three Dimensions
36. Line Integrals and Potential Energy
37. Force as Gradient of Potential
38. Equipotential Surfaces
39. The Curl and the Condition for a Conservative System
40. The Symbolic Vector ∇
Problems

Chapter VII. LAGRANGE'S EQUATIONS AND PLANETARY MOTION
Introduction
41. Lagrange's Equations
42. Planetary Motion
43. Energy Method for Radial Motion in Central Field
44. Orbits in Central Motion
45. Justification of Lagrange's Method
Problems

Chapter VIII. GENERALIZED MOMENTA AND HAMILTON'S EQUATIONS
Introduction
46. Generalized Forces
47. Generalized Momenta
48. Hamilton's Equations of Motion
49. General Proof of Hamilton's Equations
50. Example of Hamilton's Equations
51. Applications of Lagrange's and Hamilton's Equations
Problems

Chapter IX. PHASE SPACE AND THE GENERAL MOTION OF PARTICLES
Introduction
52. The Phase Space
53. Phase Space for the Linear Oscillator
54. Phase Space for Central Motion
55. Noncentral Two-dimensional Motion
56. Configuration Space and Momentum Space
57. The Two-dimensional Oscillator
58. Methods of Solution
59. Contact Transformations and Angle Variables
60. Methods of Solution for Nonperiodic Motions
Problems

Chapter X. THE MOTION OF RIGID BODIES
Introduction
61. Elementary Theory of Precessing Top
62. Angular Momentum, Moment of Inertia, and Kinetic Energy
63. The Ellipsoid of Inertia; Principal Axes of Inertia
64. The Equations of Motion
65. Euler's Equations
66. Torque-free Motion of a Symmetric Rigid Body
67. Euler's Angles
68. General Motion of a Symmetrical Top under Gravity
69. Precession and Nutation
Problems

Chapter XI. COUPLED SYSTEMS AND NORMAL COORDINATES
Introduction
70. Coupled Oscillators
71. Normal Coordinates
72. Relation of Problem of Coupled Systems to Two-dimensional Oscillator
73. The General Problem of the Motion of Several Particles
Problems

Chapter XII. THE VIBRATING STRING, AND FOURIER SERIES
Introduction
74. Differential Equation of the Vibrating String
75. The Initial Conditions for the String
76. Fourier Series
77. Coefficients of Fourier Series
78. Convergence of Fourier Series
79. Sine and Cosine Series, with Application to the String
80. The String as a Limiting Problem of Vibration of Particles
81. Lagrange's Equations for the Weighted String
82. Continuous String as Limiting Case
Problems

Chapter XIII. NORMAL COORDINATES AND THE VIBRATING STRING
Introduction
83. Normal Coordinates
84. Normal Coordinates and Function Space
85. Fourier Analysis in Function Space
86. Equations of Motion in Normal Coordinates
87. The Vibrating String with Friction
Problems

Chapter XIV. THE STRING WITH VARIABLE TENSION AND DENSITY
Introduction
88. Differential Equation for the Variable String
89. Approximate Solution for Slowly Changing Density and Tension
90. Progressive Waves and Standing Waves
91. Orthogonality of Normal Functions
92. Expansion of an Arbitrary Function Using Normal Functions
93. Perturbation Theory
94. Reflection of Waves from a Discontinuity
Problems

Chapter XV. THE VIBRATING MEMBRANE
Introduction
95. Boundary Conditions on the Rectangular Membrane
96. The Nodes in a Vibrating Membrane
97. Initial Conditions
98. The Method of Separation of Variables
99. The Circular Membrane
100. The Laplacian in Polar Coordinates
101. Solution of the Differential Equation by Separation
102. Boundary Conditions
103. Physical Nature of the Solution
104. Initial Condition at t = 0
105. Proof of Orthogonality of the J's
Problems

Chapter XVI. STRESSES, STRAINS, AND VIBRATIONS OF AN ELASTIC SOLID
Introduction
106. Stresses, Body and Surface Forces
107. Examples of Stresses
108. The Equation of Motion
109. Transverse Waves
110. Longitudinal Waves
111. General Wave Propagation
112. Strains and Hooke's Law
113. Young's Modulus
Problems

Chapter XVII. FLOW OF FLUIDS
Introduction
114. Velocity, Flux Density, and Lines of Flow
115. The Equation of Continuity
116. Gauss's Theorem
117. Lines of Flow to Measure Rate of Flow
118. Irrotational Flow and the Velocity Potential
119. Euler's Equations of Motion for Ideal Fluids
120. Irrotational Flow and Bernoulli's Equation
121. Viscous Fluids
122. Poiseuille's Law
Problems

Chapter XVIII. HEAT FLOW
Introduction
123. Differential Equation of Heat Flow
124. The Steady Flow of Heat
125. Flow Vectors in Generalized Coordinates
126. Gradient in Generalized Coordinates
127. Divergence in Generalized Coordinates
128. Laplacian
129. Steady Flow of Heat in a Sphere
130. Spherical Harmonics
131. Fourier's Method for the Transient Flow of Heat
132. Integral Method for Heat Flow
Problems

Chapter XIX. ELECTROSTATICS, GREEN'S THEOREM, AND POTENTIAL THEORY
Introduction
133. The Divergence of the Field
134. The Potential
135. Electrostatic Problems without Conductors
136. Electrostatic Problems with Conductors
137. Green's Theorem
138. Proof of Solution of Poisson's Equation
139. Solution of Poisson's Equation in a Finite Region
140. Green's Distribution
141. Green's Method of Solving Differential Equations
Problems

Chapter XX. MAGNETIC FIELDS, STOKES'S THEOREM, AND VECTOR POTENTIAL
Introduction
142. The Magnetic Field of Currents
143. Field of a Straight Wire
144. Stokes's Theorem
145. The Curl in Curvilinear Coordinates
146. Applications of Stokes's Theorem
147. Example: Magnetic Field in a Solenoid
148. The Vector Potential
149. The Biot-Savart Law
Problems

Chapter XXI. ELECTROMAGNETIC INDUCTION AND MAXWELL'S EQUATIONS
Introduction
150. The Differential Equation for Electromagnetic Induction
151. The Displacement Current
152. Maxwell's Equations
153. The Vector and Scalar Potentials
Problems

Chapter XXII. ENERGY IN THE ELECTROMAGNETIC FIELD
Introduction
154. Energy in a Condenser
155. Energy in the Electric Field
156. Energy in a Solenoid
157. Energy Density and Energy Flow
158. Poynting's Theorem
159. The Nature of an E.M.F.
160. Examples of Poynting's Vector
161. Energy in a Plane Wave
162. Plane Waves in Metals
Problems

Chapter XXIII. REFLECTION AND REFRACTION OF ELECTROMAGNETIC WAVES
Introduction
163. Boundary Conditions at a Surface of Discontinuity
164. The Laws of Reflection and Refraction
165. Reflection Coefficient at Normal Incidence
166. Fresnel's Equations
167. The Polarizing Angle
168. Total Reflection
169. The Optical Behavior of Metals
Problems

Chapter XXIV. ELECTRON THEORY AND DISPERSION
Introduction
170. Polarization and Dielectric Constant
171. The Relations of P, E, and D
172. Polarizability and Dielectric Constant of Gases
173. Dispersion in Gases
174. Dispersion of Solids and Liquids
175. Dispersion of Metals
Problems

Chapter XXV. SPHERICAL ELECTROMAGNETIC WAVES
Introduction
176. Spherical Solutions of the Wave Equation
177. Scalar Potential for Oscillating Dipole
178. Vector Potential
179. The Fields
180. The Hertz Vector
181. Intensity of Radiation from a Dipole
182. Scattering of Light
183. Polarization of Scattered Light
184. Coherence and Incoherence of Light
185. Coherence and the Spectrum
186. Coherence of Different Sources
Problems

Chapter XXVI. HUYGENS' PRINCIPLE AND GREEN'S THEOREM
Introduction
187. The Retarded Potentials
188. Mathematical Formulation of Huygens' Principle
189. Application to Optics
190. Integration for a Spherical Surface by Fresnel's Zones
191. The Use of Huygens' Principle
192. Huygens' Principle for Diffraction Problems
193. Qualitative Discussion of Diffraction, Using Fresnel's Zones
Problems

Chapter XXVII. FRESNEL AND FRAUNHOFER DIFFRACTION
Introduction
194. Comparison of Fresnel and Fraunhofer Diffraction
195. Fresnel Diffraction from a Slit
196. Cornu's Spiral
197. Fraunhofer Diffraction from Rectangular Slit
198. The Circular Aperture
199. Resolving Power of a Lens
200. Diffraction from Several Slits; the Diffraction Grating
Problems

Chapter XXVIII. WAVES, RAYS, AND WAVE MECHANICS
Introduction
201. The Quantum Hypothesis
202. The Statistical Interpretation of Wave Theory
203. The Uncertainty Principle for Optics
204. Wave Mechanics
205. Frequency and Wave Length in Wave Mechanics
206. Wave Packets and the Uncertainty Principle
207. Fermat's Principle
208. The Motion of Particles and the Principle of Least Action
Problems

Chapter XXIX. SCHRÖDINGER'S EQUATION IN ONE DIMENSION
Introduction
209. Schrödinger's Equation
210. One-dimensional Motion in Wave Mechanics
211. Boundary Conditions in One-dimensional Motion
212. The Penetration of Barriers
213. Motion in a Finite Region, and the Quantum Condition
214. Motion in Two or More Finite Regions
Problems

Chapter XXX. THE CORRESPONDENCE PRINCIPLE AND STATISTICAL MECHANICS
Introduction
215. The Quantum Condition in the Phase Space
216. Angle Variables and the Correspondence Principle
217. The Quantum Condition for Several Degrees of Freedom
218. Classical Statistical Mechanics in the Phase Space
219. Liouville's Theorem
220. Distributions Independent of Time
221. The Microcanonical Ensemble
222. The Canonical Ensemble
223. The Quantum Theory and the Phase Space
Problems

Chapter XXXI. MATRICES
Introduction
224. Mean Value of a Function of Coordinates
225. Physical Meaning of Matrix Components
226. Initial Conditions, and Determination of c's
227. Mean Values of Functions of Momenta
228. Schrödinger's Equation Including the Time
229. Some Theorems Regarding Matrices
Problems

Chapter XXXII. PERTURBATION THEORY
Introduction
230. The Secular Equation of Perturbation Theory
231. The Power Series Solution
232. Perturbation Theory for Degenerate Systems
233. The Method of Variation of Constants
234. External Radiation Field
235. Einstein's Probability Coefficients
236. Method of Deriving the Probability Coefficients
237. Application of Perturbation Theory
238. Spontaneous Radiation and Coupled Systems
239. Applications of Coupled Systems to Radioactivity and Electronic Collisions
Problems

Chapter XXXIII. THE HYDROGEN ATOM AND THE CENTRAL FIELD
Introduction
240. The Atom and Its Nucleus
241. The Structure of Hydrogen
242. Discussion of the Function of r for Hydrogen
243. The Angular Momentum
244. Series and Selection Principles
245. The General Central Field
Problems

Chapter XXXIV. ATOMIC STRUCTURE
Introduction
246. The Periodic Table
247. The Method of Self-consistent Fields
248. Effective Nuclear Charges
249. The Many-body Problem in Wave Mechanics
250. Schrödinger's Equation and Effective Nuclear Charges
251. Ionization Potentials and One-electron Energies
Problems

Chapter XXXV. INTERATOMIC FORCES AND MOLECULAR STRUCTURE
Introduction
252. Ionic Forces
253. Polarization Force
254. Van der Waals' Force
255. Penetration or Coulomb Force
256. Valence Attraction
257. Atomic Repulsions
258. Analytical Formulas for Valence and Repulsive Forces
259. Types of Substances: Valence Compounds
260. Metals
261. Ionic Compounds
Problems

Chapter XXXVI. EQUATION OF STATE OF GASES
Introduction
262. Gases, Liquids, and Solids
263. The Canonical Ensemble
264. The Free Energy
265. Properties of Perfect Gases on Classical Theory
266. Properties of Imperfect Gases on Classical Theory
267. Van der Waals' Equation
268. Quantum Statistics
269. Quantum Theory of the Perfect Gas
Problems

Chapter XXXVII. NUCLEAR VIBRATIONS IN MOLECULES AND SOLIDS
Introduction
270. The Crystal at Absolute Zero
271. Temperature Vibrations of a Crystal
272. Equation of State of Solids
273. Vibrations of Molecules
274. Diatomic Molecules
275. Specific Heat of Diatomic Molecules
276. Polyatomic Molecules
Problems

Chapter XXXVIII. COLLISIONS AND CHEMICAL REACTIONS
Introduction
277. Chemical Reactions
278. Collisions with Electronic Excitation
279. Electronic and Nuclear Energy in Metals
280. Perturbation Method for Interaction of Nuclei
Problems

Chapter XXXIX. ELECTRONIC INTERACTIONS
Introduction
281. The Exclusion Principle
282. Results of Antisymmetry of Wave Functions
283. The Electron Spin
284. Electron Spins and Multiplicity of Levels
285. Multiplicity and the Exclusion Principle
286. Spin Degeneracy for Two Electrons
287. Effect of Exclusion Principle and Spin
Problems

Chapter XL. ELECTRONIC ENERGY OF ATOMS AND MOLECULES
Introduction
288. Atomic Energy Levels
289. Spin and Orbital Degeneracy in Atomic Multiplets
290. Energy Levels of Diatomic Molecules
291. Heitler and London Method for H$_2$
292. The Method of Molecular Orbitals
Problems

Chapter XLI. FERMI STATISTICS AND METALLIC STRUCTURE
Introduction
293. The Exclusion Principle for Free Electrons
294. Maximum Kinetic Energy and Density of Electrons
295. The Fermi-Thomas Atomic Model
296. Electrons in Metals
297. The Fermi Distribution
Problems

Chapter XLII. DISPERSION, DIELECTRICS, AND MAGNETISM
Introduction
298. Dispersion and Dispersion Electrons
299. Quantum Theory of Dispersion
300. Polarizability
301. Van der Waals' Force
302. Types of Dielectrics
303. Theory of Dipole Orientation
304. Magnetic Substances
Problems

Suggested References
Index

CHAPTER I
POWER SERIES

The first result of a physical experiment is ordinarily a table of values, one column containing values of an independent variable, another of a dependent variable. In mechanics, the independent variable is ordinarily the time, the dependent variable the displacement. In thermodynamics, we may have two independent variables, as volume and temperature, and one dependent variable, the pressure. With electric currents, we may have the current flowing in some part of the circuit as dependent variable, the electromotive force applied as independent variable, as when in a vacuum tube we measure plate current as function of grid voltage. In electromagnetic theory, the electric or magnetic field strength, the dependent variable, is a function of four independent variables, the three coordinates of space, and time. The relation between independent and dependent variable can be given by a table of values, by drawing a graph, or analytically by approximating the results by a mathematical formula.
The last method is by far the most powerful, particularly if further calculations must be made using the experimental results, so that we are led to the study of mathematical functions. There are a good many well-known functions; for example, the algebraic functions, as $ax + bx^2$; the trigonometric functions, as $\sin(ax + b)$; exponential functions, as $ae^{-bx}$; and rarer things like Bessel's functions, $J_n(x)$. It may be that, by inspection of the results, or for some theoretical reason, we may decide that some such well-known function can be used to describe our experimental data within the experimental error. But in actual physical problems, we meet many functions which are not included among these well-known forms. The question presents itself, can we not get some general method of describing functions analytically, equally applicable to familiar and unfamiliar functions?

1. Power Series. Power series present one such general method, on the whole the most useful one. The simplest form of power series is $A_0 + A_1 x + A_2 x^2 + \cdots$, where the $A$'s are arbitrary coefficients. By giving these coefficients suitable values, we can make the series approach any desired function as closely as we please, with some exceptions as we shall note below. As examples of common series, we have first the polynomials (in which all $A_n$'s after a certain $n$ are zero); and then many familiar infinite series, as

$$(1 + x)^n = 1 + nx + \frac{n(n-1)}{2!}x^2 + \frac{n(n-1)(n-2)}{3!}x^3 + \cdots, \tag{1}$$

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots, \tag{2}$$

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots, \tag{3}$$

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots. \tag{4}$$

In fitting an experimental table of values, it is generally true that we cannot use one of these well-known series. We must determine coefficients to fit the data. A familiar process is that in which we know beforehand that the graph of the function should be a straight line.
Then, either by actually plotting and estimating by means of a ruler, or by using least squares, we find the two constants of the linear relation $y = a + bx$. If the graph is slightly curved, we may be able to determine the constants of a parabola $y = a + bx + cx^2$ to fit it approximately. More complicated curves can be approximated by taking more terms. It is plain that, if there are $n$ points determined experimentally, we can find a polynomial containing $n$ coefficients which will just pass through them. But this is hardly a sportsmanlike thing to do, and generally we look for a function containing far fewer constants than the number of points we wish to fit. In other words, in practice, rather than using infinite series, we are accustomed to use only the first few terms of such a series.

2. Small Quantities of Various Orders. The general justification of this method of using only a finite part of a series comes from considering small quantities of various orders, as they are called. A power series is practically useful only if it converges rather rapidly; that is, if each term is decidedly smaller than the one before it. If we imagine that a physical relation is really expressed by a rapidly converging infinite series, then the sum of all the terms after a certain one will be smaller than the inevitable errors of experiment, and may be neglected, leaving only a polynomial. Suppose, for instance, that the linear dimension $d$ of a solid under pressure, expressed as a function of the pressure $p$, is given exactly by a series $d = d_0 - ap + bp^2 - \cdots$. For small pressures, the change of length $ap$ will be small compared with $d_0$, and the second-order term $bp^2$ will be in turn small compared with $ap$ (though of course this will not be true for much higher pressures, since $ap$ will increase, and $bp^2$ will increase even more).
We express this by saying that $ap$ is a small quantity of the first order, $bp^2$ a small quantity of the second order. It may well be that the second-order quantities are so small that we can neglect them, so that approximately $d = d_0 - ap$. Now if we are interested in finding the way in which the volume, proportional to $d^3$, changes with pressure, we have accurately

$$d^3 = d_0^3 - 3d_0^2\,ap + (3d_0a^2 + 3d_0^2b)p^2 + \cdots. \tag{5}$$

But we are assuming that $ap$ is small compared with $d_0$, and $bp^2$ is small compared with $ap$, for all pressures used. Thus we readily see that the term in $p^2$ in this final expression (5) is small compared with the term in $p$, and can be neglected in comparison with the leading term $d_0^3$, so that in $d^3$, as in $d$, we can neglect the second order of small quantities. We could then have started with the abbreviated expression $d = d_0 - ap$, and have obtained the same result for $d^3$, to the first order. This method of cutting off infinite series at definite places, retaining only terms of a certain order, is very commonly used, and often is the only thing that simplifies computations with series enough to make them practically possible. But we must notice that the justification depends entirely on the physical situation, and can be different in different cases. Thus if we had to consider higher pressures in our problem above, we should have to retain the second-order terms, but perhaps could neglect third-order ones. One must always use good physical judgment in neglecting small quantities.

Now, of course, in many cases we do not need to neglect high powers at all. The problems which we meet will often have simple enough relations between the coefficients of the successive terms so that we can write down as many terms as we please, without trouble, as we can with the binomial or exponential series.
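The first-order argument of this section can be checked numerically. The sketch below is only an illustration: the constants `D0`, `A`, and `B` are hypothetical values chosen so that $ap$ and $bp^2$ are small, and the function names are not from the text. It compares $d^3$ computed from $d = d_0 - ap + bp^2$ with Eq. (5) cut off after the first-order and after the second-order term.

```python
# Hypothetical constants, chosen only so that a*p and b*p**2 are small.
D0, A, B = 1.0, 1.0e-3, 1.0e-6


def length(p):
    """d = d0 - a*p + b*p**2, the series for d cut off after second order."""
    return D0 - A * p + B * p ** 2


def cube_exact(p):
    """d**3 computed directly from the truncated series for d."""
    return length(p) ** 3


def cube_first_order(p):
    """Eq. (5) with the p**2 term dropped."""
    return D0 ** 3 - 3 * D0 ** 2 * A * p


def cube_second_order(p):
    """Eq. (5) through the p**2 term."""
    return cube_first_order(p) + (3 * D0 * A ** 2 + 3 * D0 ** 2 * B) * p ** 2


def truncation_errors(p):
    """Return (first-order error, second-order error) in d**3 at pressure p."""
    exact = cube_exact(p)
    return abs(exact - cube_first_order(p)), abs(exact - cube_second_order(p))


if __name__ == "__main__":
    for p in (1.0, 10.0, 100.0):
        e1, e2 = truncation_errors(p)
        print(f"p = {p:6.1f}  first-order error = {e1:.3e}  second-order error = {e2:.3e}")
```

With these assumed constants the first-order error grows roughly a hundredfold when $p$ is increased tenfold, as expected of a neglected term in $p^2$. Whether that error is tolerable depends on the experimental accuracy, which is the point about physical judgment made above.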
But it always pays to inquire, if the high terms of the series get too complicated to work with successfully, whether they cannot be neglected.

3. Taylor's Expansion. We have been speaking of series representing functions obtained from experiment, or about which we do not have much information. But it may be that we have to work with a function whose analytical properties we know, and in that case there is a standard method of finding its series expansion, known as Taylor's theorem. This is as follows:

$$f(x) = f(0) + f'(0)\,x + \frac{f''(0)}{2!}x^2 + \frac{f'''(0)}{3!}x^3 + \cdots, \tag{6}$$

where $f(x)$ is the function of $x$, $f(0)$ means the value of the function when $x = 0$, $f'(0)$ is the first derivative for $x = 0$, and so on, so that $f(x) = A_0 + A_1x + A_2x^2 + \cdots$, where $A_n = f^{(n)}(0)/n!$. To justify this, we need only differentiate $n$ times, obtaining very easily

$$f^{(n)}(x) = n(n-1)\cdots(2)(1)A_n + (n+1)(n)\cdots(2)A_{n+1}x + (n+2)(n+1)\cdots(3)A_{n+2}x^2 + \cdots = n!\,A_n + \frac{(n+1)!}{1!}A_{n+1}x + \cdots.$$

If now we let $x = 0$, all terms but the first vanish, so that we have $f^{(n)}(0) = n!\,A_n$, or $A_n = f^{(n)}(0)/n!$.

4. The Binomial Theorem. As an illustration of Taylor's expansion, we prove the binomial theorem, the expansion of $(1 + x)^n$ given in Eq. (1). We have $f(x) = (1 + x)^n$, $f'(x) = n(1 + x)^{n-1}$, $f''(x) = n(n - 1)(1 + x)^{n-2}$, etc., by differentiation. Thus, setting $x = 0$, $(1 + x)$ goes into 1, so that we have $f(0) = 1$, $f'(0) = n$, $f''(0) = n(n - 1)$, etc., and $A_0 = 1$, $A_1 = n/1!$, $A_2 = n(n - 1)/2!$, etc.

5. Expansion about an Arbitrary Point. A slightly more general expansion is obtained by shifting the origin along the $x$ axis to a point $a$. The expansion is

$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots. \tag{7}$$

From Taylor's theorem, we can see immediately a general condition which a function must satisfy if it can be expanded in power series about a given point (by expanding about a point we mean setting up an expansion in powers of $x - a$, if $a$ is the given point).
The function and all its derivatives must be finite at the point in question, since otherwise some coefficients of the expansion will be infinite. Thus for example we cannot expand $1/x$ in power series in $x$: we have $f(0) = 1/0 =$ infinite, and all the derivatives are also infinite. Such a point is called a singular point of the function. But by expanding about another point we can avoid this difficulty. Thus we can expand $1/x$ about $a$, if $a \neq 0$; $f(a) = 1/a$, $f'(a) = -1/a^2$, $f''(a) = 1\cdot 2/a^3$, $f'''(a) = -1\cdot 2\cdot 3/a^4$, etc., so that

$$\frac{1}{x} = \frac{1}{a} - \frac{(x - a)}{a^2} + \frac{(x - a)^2}{a^3} - \frac{(x - a)^3}{a^4} + \cdots. \tag{8}$$

From this we can understand that a function can be expanded in power series about a point which is not a singular point.

6. Expansion about a Pole. At some singular points, the function behaves like $1/x^n$, an inverse power of $x$. Such a singularity is called a pole. If $f(x)$ has a pole of order $n$ at the origin, then by definition $x^n f(x)$ has no singularity at the origin, and can be expanded in power series $A_0 + A_1x + \cdots$. Thus we have for $f(x)$ the expansion

$$f(x) = \frac{A_0}{x^n} + \frac{A_1}{x^{n-1}} + \cdots + \frac{A_{n-1}}{x} + A_n + A_{n+1}x + \cdots,$$

an infinite series starting with inverse powers, but turning into an ordinary series of positive powers after its $n$th term. A similar theorem holds for expansion about a pole at $x = a$. A singularity which is not a pole is called an essential singularity. An example of an essential singularity is that possessed by the function $e^{-1/x}$ at $x = 0$. This function approaches 0 as $x$ approaches 0 through positive values, but becomes infinite as $x$ approaches 0 through negative values, and no inverse power $1/x^n$ has such a behavior.

7. Convergence. A series is said to converge if the process of adding its terms is one that can be carried out and that leads to a definite answer. Thus $(1 - x)^{-1}$, by the binomial theorem, is equal to $1 + x + x^2 + x^3 + \cdots$. Now if $x$ is less than unity, and we try to add these terms, we get an answer.
For example, if $x = 0.1$, we have $1 + 0.1 + 0.01 + 0.001 + 0.0001 + \cdots = 1.111\cdots$, which equals $(1 - 0.1)^{-1} = 10/9$, as it should. But if $x$ is greater than unity, this no longer holds: if $x = 2$, we have $1 + 2 + 4 + 8 + \cdots$, which certainly is infinitely great, and leads to no definite value. Another situation is obtained if we set $x = -1$ in the series, when we have $1 - 1 + 1 - 1 + 1 \cdots$, a series which is said to oscillate (successive terms have opposite signs). As a matter of fact, we find that the series $1 + x + x^2 + x^3 + \cdots$, which is called the geometric series, converges if $x$ is between $-1$ and $+1$, but does not converge if $x$ is equal to or greater than 1, or equal to or less than $-1$. This series illustrates two of the simplest types of nonconvergence of series, the simple divergence, in which terms get greater and greater, and the oscillation, where the terms have the same order of magnitude but alternate sign. There is still another type of series which does not converge, sometimes called the semiconvergent or asymptotic series, whose terms begin to decrease regularly as we go out in the series, but after a certain point start increasing, and eventually become infinite. These asymptotic series often can be used for computation, for it can be shown in many cases that, if we retain terms just up to the smallest one, the resulting sum is a good approximation to the function the series is supposed to represent.

Our definition of convergence in the last paragraph was very crude. More exactly, a series converges if the sum of the first $n$ terms approaches a limit as $n$ increases indefinitely. This definition agrees with the usual procedure of the physicist, for he often computes by series, and he does it by adding a finite number of terms.
He carries this far enough so that adding more terms does not change the sum, to the order of accuracy to which he works, which essentially means that the sum is approaching a limit.

To tell whether a given series converges is not always easy. In the first place, we can be sure in some cases that given Taylor's expansions cannot converge if the argument (that is, the independent variable) has too large a value. Thus $1 + x + x^2 + \cdots$ does not converge if $x$ is equal to, or greater than, 1, and we could have seen this from the fact that the series equals $1/(1 - x)$, which has a singularity for $x = 1$ (being equal to $1/0$). Thus the function is infinite for $x = 1$, and the series to represent it could not converge. And increasing $x$ beyond 1 cannot make the series converge again. In fact, as soon as the variable in a series becomes greater than the value for which the function has a singularity, the series will diverge. But it is a little more complicated than this would seem, for $1 + x + x^2 + \cdots$ diverges also for $x$ less than $-1$, and there is no singularity here. As a matter of fact, a power series converges in general so long as the argument is less in absolute value than the smallest value for which there is a singularity, but not beyond. But this singularity can come from imaginary or complex values of the argument, so that we might well miss it completely if we did not consider imaginary values. For this reason, this criterion for convergence is rather tricky.

When we actually examine a series, we can often tell whether it converges or not. Surely a series cannot converge unless its successive terms get smaller and smaller. We can investigate this by the ratio test, taking the ratio of the $n$th term to the one before, and seeing how this ratio changes as we go out in the series. If the limiting ratio is less than 1, the series converges; if it is greater than 1, it diverges.
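The ratio test just described can be tried numerically. The sketch below, with helper names (`term_ratios`, `partial_sum`) that are purely illustrative, tabulates the ratio of successive terms for the geometric series and for the series for $e^x$ at an assumed value $x = 0.5$, and compares a partial sum of the geometric series with $1/(1 - x)$.

```python
import math

def term_ratios(term, n_values):
    """Return |term(n) / term(n-1)| for each n in n_values."""
    return [abs(term(n) / term(n - 1)) for n in n_values]

def partial_sum(term, n_terms, start=0):
    """Sum of n_terms successive terms, beginning at index `start`."""
    return sum(term(n) for n in range(start, start + n_terms))

x = 0.5  # assumed sample argument, inside the range of convergence

def geometric(n):
    """nth term of 1 + x + x**2 + ..."""
    return x**n

def exponential(n):
    """nth term of the series for e**x."""
    return x**n / math.factorial(n)

print(term_ratios(geometric, [10, 20, 40]))    # ratio stays at x
print(term_ratios(exponential, [10, 20, 40]))  # ratio x/n shrinks toward 0
print(partial_sum(geometric, 50), 1 / (1 - x))
```

Looking at the ratio at a few large values of $n$ only suggests the limiting ratio; it is not a proof of convergence.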
If the ratio is just 1, the test gives no information. Thus for example with the series $x + x^2/2 + x^3/3 + \cdots$, the ratio of the term in $x^n$ to that in $x^{n-1}$ is

$$\frac{x^n/n}{x^{n-1}/(n-1)} = \frac{n-1}{n}\,x.$$

As $n$ approaches infinity, $n - 1$ and $n$ become approximately equal, so that the ratio approaches $x$. Thus we see that if $x$ is less numerically than unity, this series converges; if $x$ is greater than unity, it diverges; if $x = 1$, we cannot say. From other information, we know that the series when $x = 1$, which is $1 + 1/2 + 1/3 + 1/4 + \cdots$, diverges. But with the similar series $x + x^2/2^2 + x^3/3^2 + \cdots$, where the ratio of terms also approaches $x$ as we go out in the series, and the series again diverges for $x$ greater numerically than unity and converges for $x$ less than unity, we have just the other situation at $x = 1$: the series $1 + 1/2^2 + 1/3^2 + \cdots$ converges.

Often a series can be approximately summed by comparison with an integral. Thus

$$1 + \frac{1}{2^n} + \frac{1}{3^n} + \cdots = \sum_{z=1}^{\infty}\frac{1}{z^n} = \int_1^{\infty}\frac{dz}{z^n}$$

approximately. The approximation is rather poor for the small values of $z$, but becomes better for large $z$ values, on which the convergence depends. It would be a better approximation, for instance, to write

$$\frac{1}{10^n} + \frac{1}{11^n} + \cdots = \int_{10}^{\infty}\frac{dz}{z^n}.$$

From this we see that the series converges when $n > 1$, the integral being $-\dfrac{1}{(n-1)\,z^{n-1}}$, which is zero at the upper limit, but diverges if $n \leq 1$. For $n = 1$, for instance, the integral becomes logarithmically infinite at $z = \infty$.

Problems

1. Plot $\dfrac{1}{x^2} - \dfrac{1}{x}$ as a function of $x$, and show that it has a minimum at $x = 2$. Expand in Taylor's series about this point, obtaining an expansion $y = A_0 + A_2(x - 2)^2 + A_3(x - 2)^3 + \cdots$, where necessarily the coefficient $A_1$ is zero.
Now plot on the graph the successive approximations $y = A_0$, $y = A_0 + A_2(x - 2)^2$, $y = A_0 + A_2(x - 2)^2 + A_3(x - 2)^3$, $y = A_0 + A_2(x - 2)^2 + A_3(x - 2)^3 + A_4(x - 2)^4$, observing how they approximate the real curve more and more accurately.

2. a. Derive the series for the exponential, cosine, and sine directly from Taylor's theorem.
b. Differentiate the series for $\sin x$ term by term, and show that the result is the series for $\cos x$.

3. In the series for $e^x$, set $x = 1$, obtaining a series for $e$. Using this series, compute the value of $e$ to four decimal places.

4. Why does one always have series for $\ln (1 + x)$ in powers of $x$, rather than for $\ln x$? From the series for $\ln (1 + x)$, compute logarithms to base $e$ of 1.1, 1.2, 1.3, 1.4, 1.5.

5. The function $1/(x - i)$, where $i = \sqrt{-1}$, has a singularity for $x = i$, but not for any real value of $x$. Show that nevertheless the series expansion about $x = 0$ diverges for $x$ greater than 1 or less than $-1$, obtaining the power series by Taylor's theorem, and separating real and imaginary parts of the series. This is an example of a case where the series diverges on account of singularities for complex values of $x$.

6. As a result of an experiment, we are given the table of values following:

     x      y
     1     7.0
     2    11.1
     3    15.2
     4    19.3
     5    23.2
     6    27.1
     7    30.8
     8    34.5
     9    38.2
    10    41.7

Try to devise some practicable scheme for telling whether this function (in which, being a result of experiment, the values are only approximations) can be represented within the error of experiment by a linear, quadratic, cubic, etc., polynomial. Get the coefficients of the resulting series, and use them to find the value of the function and its slope at $x = 0$. Plot the points, the curve which approximates them, and the straight-line tangent to the curve at $x = 0$. It is legitimate to use graphical methods if you wish.

7. Expand $\tan^{-1} x$ in a power series about $x = 0$.
Hints:
(a) $\dfrac{d}{dx}\tan^{-1} x = \dfrac{1}{1 + x^2}$.
(b) $\dfrac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + \cdots$.
(c) $\displaystyle\int \frac{d}{dx}\left(\tan^{-1} x\right) dx = \tan^{-1} x + c$.
What is the range of convergence of the resulting series? Calculate from this series the value of $\pi/4 = \tan^{-1} 1$ correct to 5 per cent. How many terms of the series are necessary to obtain this accuracy?

8. By a procedure analogous to that used in Prob. 7 expand $\sin^{-1} x$ in a power series about $x = 0$. Find the range of convergence for this series.

9. From the known Taylor's series for $e^x$, write the corresponding series for $e^{-x^2}$. By integrating this series obtain to 1 per cent a value for $\int_0^1 e^{-x^2}\,dx$, whose correct value is 0.747 . . . .

10. Make use of the binomial theorem to obtain an expansion of $\sqrt{1 + \sqrt{x}}$ in ascending powers of $x^{1/2}$. What is the range of convergence?

11. Discuss by the ratio test the convergence of the following series:
(a) $x + x^2/2 + x^3/3 + x^4/4 + \cdots$
(b) $x + x^2/2^2 + x^3/3^2 + x^4/4^2 + \cdots$
(c) The binomial expansion of $(1 + x)^k$, for nonintegral $k$.
(d) The series for $e^x$, $\sin x$, $\cos x$.

CHAPTER II
POWER SERIES METHOD FOR DIFFERENTIAL EQUATIONS

Most important physical laws involve statements giving the relation between the rate of change of some quantity and other quantities. Such a relation, stated in mathematical language, is a differential equation: an equation containing derivatives of functions, as well as the functions themselves. For example, the fundamental law of mechanics is Newton's second law of motion: the force equals the time rate of change of the momentum. Or in electricity, in a circuit containing an inductance, the back electromotive force of the inductance equals a constant times the time rate of change of the current. But these differential relations are not in the form which can be used in making direct connection with experiment. One cannot directly plot graphs, or give tables of values, from them.
One must rather solve the differential equations, that is, find algebraic relations between the variables, containing no differentiations, but consistent with the differential equations. For most of our course we shall be interested in finding such solutions of differential equations.

Solving differential equations is rather like integrating functions: there are no general rules. Individual cases must be treated by appropriate special methods. We shall meet some such special rules, and shall make much use of some of them. Those who have studied differential equations have learned a variety of such rules. But rather more important on the whole is a method which is applicable, though not always most convenient, in a very large number of cases: the method of power series. In general, the solution of a differential equation consists of a certain functional relation between variables. If we assume that this function is expanded in power series, our only problem is to determine the coefficients. And by substituting the series back into the differential equation, we can very often get conditions for determining them. We shall illustrate the method by examples.

8. The Falling Body. Imagine a body moving vertically under the action of gravity. To describe its motion, we have an independent variable, the time $t$, and a dependent variable, the height $x$. Let the mass of the body be $m$, and let its velocity, which is of course $dx/dt$, be also called $v$. The force acting on it is $F$. Then Newton's law states that $F = \dfrac{d(mv)}{dt}$, where $mv$ is the momentum. If the mass is constant (which does not always have to be the case, as we shall see in Prob. 7), we can rewrite the equation as $F = m\,dv/dt$, or $F = ma$, where $a$ is the acceleration. Substituting $v = dx/dt$, this is also $F = m\,d^2x/dt^2$. These are all forms of Newton's second law, written as differential equations.
We shall first take the case where the force, like that of gravity on the earth's surface, is constant: $F = \text{constant} = -mg$, where $g$ is the acceleration of gravity, and where the negative sign means that the force is downward. Then we have

$$F = -mg = m\frac{dv}{dt} = m\frac{d^2x}{dt^2}, \tag{1}$$

or $d^2x/dt^2 = dv/dt = -g$. These can be solved at once, by direct integration: integrating once with respect to $t$, $dx/dt = v = \text{constant} - gt = v_0 - gt$, where $v_0$, the constant of integration, obviously means the value of the velocity when $t = 0$. Integrating again, and calling the second constant of integration $x_0$, we have $x = x_0 + v_0t - \tfrac{1}{2}gt^2$, containing now two arbitrary constants, the initial position and initial velocity. The presence of such arbitrary constants is the most characteristic feature of the solutions of differential equations. And we note that the number of arbitrary constants equals the number of integrations we must perform to get rid of the differentiations. If the differential equation is one of the first order (with only first derivatives in it), there will be one arbitrary constant in the solution; if it is of the second order (second derivatives), there will be two, and so on. And always the arbitrary constants must be determined so as to satisfy certain "initial conditions," such as the values of the position and velocity at $t = 0$.

9. Falling Body with Viscosity. With the problem of the falling body, the solution has automatically come out as a polynomial in $t$, which is simply a power series that breaks off, so that there is no need of more complicated methods. But now let us take a more difficult case: we assume the body to be falling through a viscous medium under the action of gravity. Here the force is a sum of two parts: gravity, $-mg$, and a frictional force depending on velocity.
It is found experimentally that for small velocities this frictional force, in a viscous medium, is proportional to the velocity, with, of course, a negative coefficient, since it opposes the motion, changing sign with the velocity. Let it be called $-kv$, $k$ being the coefficient, which depends in a complicated way on the shape and size of the body, and is proportional to the coefficient of viscosity of the fluid. Then we have

$$m\frac{dv}{dt} = -mg - kv, \quad\text{or}\quad m\frac{dv}{dt} + kv = -mg. \tag{2}$$

This is a simple sort of differential equation, in a standard form. It is

1. A linear differential equation. That is, it contains $v$ and its derivatives (as $v$, $dv/dt$, $d^2v/dt^2$, etc.) only in the first power (in $dv/dt$, $kv$), or the zero power ($-mg$, independent of $v$), not as squares or cubes [as, for example, $(dv/dt)^2$], or products (as $v\,dv/dt$).
2. A differential equation of the first order (containing no derivative higher than the first).
3. An inhomogeneous equation (it contains terms of both the first power and the zero power in $v$ and its derivatives, while a homogeneous equation contains only terms of the same power, as all of the first power; that is, if the term $-mg$ were absent, the equation would be homogeneous).

We cannot solve Eq. (2) by direct integration, for if we integrate with respect to $t$, one term would be $\int v\,dt$, which we cannot evaluate, since $v$ is an unknown function of time. Thus we must proceed differently. Let us assume that $v$ is given by a power series in the time, $v = A_0 + A_1t + A_2t^2 + \cdots$, and try to determine the coefficients. We do this by substituting the series in the equation. We have by direct differentiation

$$\frac{dv}{dt} = A_1 + 2A_2t + 3A_3t^2 + \cdots + (n + 1)A_{n+1}t^n + \cdots.$$
Then, substituting, we have

$$m\left[A_1 + 2A_2t + 3A_3t^2 + \cdots + (n + 1)A_{n+1}t^n + \cdots\right] + k\left(A_0 + A_1t + A_2t^2 + \cdots + A_nt^n + \cdots\right) = -mg.$$

Rearranging,

$$\left(A_1 + \frac{k}{m}A_0 + g\right) + \left(2A_2 + \frac{k}{m}A_1\right)t + \cdots + \left[(n + 1)A_{n+1} + \frac{k}{m}A_n\right]t^n + \cdots = 0. \tag{3}$$

This states that a certain power series in $t$ is equal to zero, for all values of $t$. But the only function of $t$ which is always zero is zero itself, and by Taylor's theorem the expansion of zero in power series is a series all of whose coefficients are zero. Thus Eq. (3) can only be satisfied, for all values of $t$, if each coefficient vanishes:

$$A_1 + \frac{k}{m}A_0 + g = 0, \qquad 2A_2 + \frac{k}{m}A_1 = 0, \qquad \ldots, \qquad (n + 1)A_{n+1} + \frac{k}{m}A_n = 0. \tag{4}$$

Here we have an infinite set of equations to solve for the coefficients $A$. Fortunately they are so arranged that we can solve them, getting all $A$'s in terms of $A_0$, if we start with the first and work down:

$$A_1 = -\left(\frac{k}{m}A_0 + g\right),$$
$$A_2 = -\frac{1}{2}\frac{k}{m}A_1 = \frac{1}{2!}\frac{k}{m}\left(\frac{k}{m}A_0 + g\right),$$
$$A_3 = -\frac{1}{3}\frac{k}{m}A_2 = -\frac{1}{3!}\frac{k^2}{m^2}\left(\frac{k}{m}A_0 + g\right),$$
$$A_{n+1} = -\frac{1}{n + 1}\frac{k}{m}A_n = (-1)^{n+1}\frac{1}{(n + 1)!}\frac{k^n}{m^n}\left(\frac{k}{m}A_0 + g\right). \tag{5}$$

And the power series is

$$v = A_0 + A_1t + \cdots = A_0 - \left(\frac{k}{m}A_0 + g\right)\left(t - \frac{1}{2!}\frac{k}{m}t^2 + \frac{1}{3!}\frac{k^2}{m^2}t^3 - \cdots\right). \tag{6}$$

Thus we have the solution. If we set $t = 0$, we have $v = A_0$, so that $A_0$ is simply the initial velocity, and is the arbitrary constant which we meet in the solution. We could compute from our series the value of $v$ at any time $t$, knowing the initial velocity.

It happens in this case that we can recognize the infinite series as representing a familiar function. For we have

$$e^{-\frac{k}{m}t} = 1 - \frac{k}{m}t + \frac{1}{2!}\frac{k^2}{m^2}t^2 - \frac{1}{3!}\frac{k^3}{m^3}t^3 + \cdots,$$

which has close connection with our series, so that we can write at once

$$v = A_0 + \left(\frac{k}{m}A_0 + g\right)\frac{e^{-\frac{k}{m}t} - 1}{k/m} = \left(A_0 + \frac{mg}{k}\right)e^{-\frac{k}{m}t} - \frac{mg}{k}. \tag{7}$$

[Fig. 1. Velocity of damped falling body, with various initial conditions.]

We can see the physical properties of the solution most clearly from the graph in Fig.
1. No matter what the initial velocity may have been, the particle finally settles down to motion with a constant speed, given by $-mg/k$. The initial velocity is $A_0$, and if this is greater than the final velocity, the body slows down; if it is less, it speeds up, to attain this final speed.

10. Particular and General Solutions for Falling Body with Viscosity. It is instructive to notice that we can solve our problem in an elementary way. Our equation is $m\,dv/dt + kv = -mg$. Plainly a particular solution is given by assuming a constant velocity. Then $dv/dt$ is zero, so that the equation is $kv = -mg$, or $v = -mg/k$. But this is not the most general solution, for it does not have an arbitrary constant; it represents merely the particular case in which the initial velocity happened to be just the correct final value, and is unable to describe any other initial condition. To get a general solution, we proceed as follows: we take the homogeneous equation $m\,dv/dt + kv = 0$, which we obtain from our inhomogeneous equation by leaving out the term $-mg$. We can easily solve this: writing it $dv/v = -(k/m)\,dt$, and integrating, we have $\ln v = -(k/m)t + \text{constant}$, and taking the exponential, $v = \text{constant} \times e^{-(k/m)t}$, where the constant is arbitrary. Then the sum of this general solution of the homogeneous equation, and the particular solution $-(m/k)g$ of the inhomogeneous equation, is the solution we desire. We may prove this easily. For we have

$$\left(m\frac{d}{dt} + k\right)\left(Ce^{-(k/m)t}\right) = 0, \qquad \left(m\frac{d}{dt} + k\right)\left(-\frac{m}{k}g\right) = -mg.$$

Adding,

$$\left(m\frac{d}{dt} + k\right)\left(Ce^{-(k/m)t} - \frac{m}{k}g\right) = -mg,$$

showing that the function $Ce^{-(k/m)t} - (m/k)g$ satisfies the differential Eq. (2). The procedure we have just used is an illustration of the general rule: A general solution of an inhomogeneous equation is obtained by adding a particular solution of the inhomogeneous equation, and a general solution of the related homogeneous equation.
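The agreement between the series solution of Sec. 9 and the closed form, the sum of the homogeneous and particular solutions, can be checked numerically. The Python sketch below builds the coefficients from the recurrence $(n + 1)A_{n+1} + (k/m)A_n = 0$, with the extra term $g$ in the first equation, and compares the partial sum with $(A_0 + mg/k)e^{-(k/m)t} - mg/k$. The function names and the values $g = 980$ cm/sec$^2$ and $k/m = 0.98$ sec$^{-1}$ are assumptions made for illustration only.

```python
import math

def series_velocity(t, v0, k_over_m, g, n_terms=40):
    """Partial sum of v = A0 + A1*t + A2*t**2 + ... from the recurrence."""
    coeff = v0            # A0 is the initial velocity
    total = coeff
    power = 1.0
    for n in range(n_terms - 1):
        # A1 = -((k/m)*A0 + g); afterwards A_{n+1} = -(k/m)*A_n/(n+1)
        coeff = -(k_over_m * coeff + (g if n == 0 else 0.0)) / (n + 1)
        power *= t
        total += coeff * power
    return total

def closed_form_velocity(t, v0, k_over_m, g):
    """v = (A0 + m*g/k)*exp(-(k/m)*t) - m*g/k."""
    terminal = -g / k_over_m          # the limiting velocity -m*g/k
    return (v0 - terminal) * math.exp(-k_over_m * t) + terminal

g, k_over_m = 980.0, 0.98             # assumed values: cm/sec^2 and 1/sec
for v0 in (0.0, -2000.0):
    for t in (0.0, 0.5, 1.0, 3.0):
        print(v0, t,
              series_velocity(t, v0, k_over_m, g),
              closed_form_velocity(t, v0, k_over_m, g))
```

A fixed number of terms is adequate only while $(k/m)t$ is of moderate size; for long times the closed form is the practical choice.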
In this statement, the terms "particular solution" and "general solution" are used in a technical sense: a "particular solution" is one which satisfies the differential equation but has no arbitrary constants; a "general solution" is one which has its full complement of arbitrary constants. The proof of the rule in general is carried out just as in our case, adding the particular solution of the inhomogeneous equation and a general solution of the homogeneous equation, and showing that the sum satisfies the inhomogeneous equation. One thing should be noted: the properties we have been discussing depend entirely on the linear character of the differential equation, for it is only with linear functions $f$ that $f(x_1) + f(x_2) = f(x_1 + x_2)$.

11. Electric Circuit Containing Resistance and Inductance. The theory of the electrical circuit reminds one in many ways of mechanical principles: electric current is analogous to velocity, charge to displacement, electromotive force to mechanical force. Thus in a circuit containing a resistance, inductance, and condenser, all in series, the current can flow through the circuit, piling up in the condenser because it cannot flow through. Let $q$ be the charge on one plate of the condenser ($-q$ being the charge on the other), and let $i$ be the current flowing through the circuit toward the condenser plate in question, so that the current measures just the amount of charge per second flowing onto the condenser plate, or $i = dq/dt$ (as $v = dx/dt$). Now let the coefficient of self-induction of the circuit be $L$, the resistance $R$, the capacity of the condenser $C$. Then there are three e.m.fs. (electromotive forces) acting on the current, in addition to a possible external e.m.f. $E$ from a battery: the back e.m.fs. of induction, resistance, and capacity.
The first is $-L\,\dfrac{di}{dt}$, the electromotive force induced in a circuit when the current changes; the second is $-Ri$, the value familiar from Ohm's law; the third is $-q/C$, as given by the elementary law of the condenser. These are all negative, for they act to oppose the current. Now the law of the circuit is that the total e.m.f. acting on the circuit is zero:

$$-L\frac{di}{dt} - Ri - \frac{q}{C} + E = 0, \quad\text{or}\quad L\frac{di}{dt} + Ri + \frac{q}{C} = E. \tag{8}$$

This is a differential equation. Let us take the special case where there is no condenser, so that the equation is $L\,di/dt + Ri = E$. The equation is then exactly analogous to the equation $m\,dv/dt + kv = F$, which we had for a falling body with viscosity. And we see that self-induction is analogous to inertia, resistance to viscosity. The analogy is often valuable. If now the applied e.m.f. $E$ of the battery is constant, the problem can be solved mathematically just as before, and we find $i = \text{constant} \times e^{-(R/L)t} + E/R$. The first term is the transient effect, of arbitrary size, as we see from the arbitrary constant, rapidly dying out as time goes on, while the second is the constant value given by Ohm's law, the value to which the current tends if we wait long enough.

Problems

1. Show that the solution $v = (A_0 + mg/k)e^{-(k/m)t} - mg/k$ reduces properly to uniformly accelerated motion in the limiting case where the viscous resistance vanishes. Illustrate this graphically, showing curves for several different $k$'s, and finally for $k = 0$, all with the same initial velocity.

2. A raindrop weighs 0.1 gm., and after falling from rest reaches a limiting speed of 1,000 cm. per second by the time it reaches the earth. How long did it take to reach half its final speed? Nine tenths of its final speed? How far did it travel before reaching half its final speed? For how long could its velocity be described by the simple law $v = -gt$ to an error of 1 per cent?

3.
At high velocities, the viscous resistance is proportional to the third power of the velocity. Assuming this law, set up the differential equation for a particle falling under gravity and acted on by such a viscous drag. Solve by power series, obtaining at least four terms in the expansion for $v$ as a function of $t$. Draw graphs of velocity as function of time, and discuss the solutions physically.

4. Using the same law of viscosity as in the preceding problem, but assuming no gravitational force, solve by direct integration of the differential equation for the case of a particle starting with given initial velocity and being damped down to rest. Show by Taylor's expansion of this function that it agrees with the special case of the power series of the preceding problem obtained by letting the gravitational force be zero.

5. A large coil has a resistance of 0.7 ohm, inductance of 5 henries. Until $t = 0$, no current is flowing in the coil. At that moment, a battery of 5 volts e.m.f. is connected to it. After 5 sec., the battery is short-circuited and the current in the coil allowed to die down. Compute the current as function of the time, drawing a curve to represent it.

6. A coil having $L = 10$ henries, $R = 1$ ohm, has no current flowing in it until $t = 0$. Then it has an applied voltage increasing linearly with the time, from zero at $t = 0$, to 1 volt at $t = 1$ sec. After $t = 1$, the e.m.f. remains equal to 1 volt. By series methods find the current at any time, and plot the curve.

7. Suppose we have a rocket, shot off with initial velocity $v_0$, and thereafter losing mass according to the law $m = m_0(1 - ct)$, where $m$ is the mass at any time, $m_0$ the initial mass at $t = 0$, $c$ is a constant, and where the mass lost does not have appreciable velocity after it leaves the rocket. Show that on account of the loss of mass the rocket is accelerated, just as if a force were acting on a body of constant mass. The rocket is acted on by a viscous resisting force in addition.
Taking account of these forces, find the differential equation for its velocity as a function of time, and integrate the equation directly. Now find also the solution for $v$ as a power series in the time. Show that the resulting series agrees with that obtained by expanding the exact solution. Calculate the limiting ratio of successive terms in the power series, as we go out in the series, and from this result obtain the region of convergence of the series. Is this result reasonable physically? What happens in the exact solution outside the range of convergence?

8. In a radioactive disintegration, the number of atoms disintegrating per second, and turning into atoms of another sort, is simply proportional to the total number of radioactive atoms present. Write down the differential equation for the number of atoms present at any time, and find its solution. Assuming that half the atoms of a sample of radium disintegrate in 1,300 years, how many would decay in the first year?

9. If at the same time radium were being produced at a constant rate by disintegration of uranium, how would this change the situation in the preceding problem? Set up the new differential equation. Assuming that we start without any radium, but with pure uranium, find the amount of radium as a function of the time. Show that the amount of radium approaches an equilibrium amount, which it reaches in time, whether the initial amount of radium is greater or less than the equilibrium amount.

10. Find a series solution for the differential equation $m\,dv/dt + kv = c/t$, where $c$ is a constant, representing a damped motion under the action of an external force which decreases inversely proportionally to the time, the series having the form $v = a_1/t + a_2/t^2 + \cdots$. Show that this series is divergent for all values of $t$. Show that the differential equation is formally satisfied by the expression

$$v = \frac{c}{m}\,e^{-(k/m)t}\int_{-\infty}^{t}\frac{e^{(k/m)t}}{t}\,dt.$$
This solution is convergent for $t$ negative. The integral $\displaystyle\int_{-\infty}^{t}\frac{e^{t}}{t}\,dt$ is known as the exponential integral function, and is important in physics and mathematics. It is frequently calculated by using the above divergent series. Explain how this procedure might be valid.

11. Suppose a particle is acted on by a damping force proportional to the velocity, and by a force which varies sinusoidally with the time. Solve the resulting differential equation for velocity as function of time, by the series method, by expanding the force in power series in the time. Can you recognize the analytical form of the resulting power series?

12. Solve by power series Bessel's equation $\dfrac{d^2y}{dx^2} + \dfrac{1}{x}\dfrac{dy}{dx} + y = 0$. The result is Bessel's function of the zero order, $J_0(x)$. From the series, plot $J_0(x)$ for $x$ between 0 and 5.

13. The equation for Bessel's function of the $m$th order, $J_m(x)$, is $\dfrac{d^2y}{dx^2} + \dfrac{1}{x}\dfrac{dy}{dx} + \left(1 - \dfrac{m^2}{x^2}\right)y = 0$. Solve by power series, showing that the first term in the expansion is that in $x^m$. Plot $J_1(x)$ for $x$ between 0 and 5. Bessel's functions oscillate, like the sine and cosine, all the way to infinity. We shall use them in discussing standing waves in a circular membrane, and for many other problems. The second independent solution of the equation is infinite at the origin, and hence cannot be expanded in power series.

CHAPTER III
POWER SERIES AND EXPONENTIAL METHODS FOR SIMPLE HARMONIC VIBRATIONS

In the last chapter we have found a general method of power series for solving differential equations, and have applied it to the problem of motion under viscous forces. Next we consider the same method, applied to somewhat different problems: a particle acted on by restoring forces proportional to the distance, or an electric circuit containing inductance and capacity.

12. Particle with Linear Restoring Force.
Suppose that the force acting on a particle is proportional to the displacement from a fixed position, and opposite to the displacement, a so-called linear restoring force. This force is $-kx$, if $x$ is the displacement, $k$ a constant. For the moment we assume that there is no gravitational or other external force acting. Then the equation of motion is $m\,d^2x/dt^2 = -kx$, or
$$m\frac{d^2x}{dt^2} + kx = 0. \tag{1}$$
This is a homogeneous linear differential equation of the second order, with constant coefficients (that is, $m$, $k$ are independent of time). We solve it in series as before. If $x = A_0 + A_1 t + A_2 t^2 + \cdots$, we have immediately, by the method used before,
$$(2mA_2 + kA_0) + (3\cdot 2\,mA_3 + kA_1)\,t + (4\cdot 3\,mA_4 + kA_2)\,t^2 + \cdots = 0.$$
Thus, setting the separate coefficients equal to zero, and solving one equation after the other, we find
$$A_2 = -\frac{1}{2}\frac{k}{m}A_0, \qquad A_3 = -\frac{1}{3!}\frac{k}{m}A_1, \qquad \ldots \tag{2}$$
These equations determine all the coefficients in terms of two arbitrary ones, $A_0$ and $A_1$, which are the two arbitrary constants to be expected in the solution of a second-order differential equation. The solution may be written
$$x = A_0\left[1 - \frac{1}{2!}\frac{k}{m}t^2 + \frac{1}{4!}\left(\frac{k}{m}\right)^2 t^4 - \cdots\right] + A_1\left[t - \frac{1}{3!}\frac{k}{m}t^3 + \cdots\right]. \tag{3}$$
We now observe that these series represent well-known functions: the first is the cosine, the second the sine, except for a factor, so that we have
$$x = A_0\cos\sqrt{k/m}\,t + A_1\sqrt{m/k}\,\sin\sqrt{k/m}\,t. \tag{4}$$
Thus the motion is a periodic one, as shown by the sinusoidal functions. The period $T$ is found from the fact that when $t$ increases by $T$, the sine or cosine must come back to its initial value, which it does when its argument (that is, the thing whose cosine we are taking) increases by $2\pi$. Thus $\sqrt{k/m}\,T = 2\pi$, $T = 2\pi\sqrt{m/k}$, the familiar formula for the period in simple harmonic motion. From this, the frequency $\nu$ is given by $\nu = 1/T = (1/2\pi)\sqrt{k/m}$, and the angular velocity $\omega$ by $\omega = 2\pi\nu = \sqrt{k/m}$.
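As a numerical check on Eqs. (2) to (4), the recursion for the coefficients can be summed directly and compared with the closed form. The following is a minimal Python sketch, not part of the original text; the function names and the sample values of $k$, $m$, $A_0$, $A_1$ are illustrative assumptions.

```python
import math

def series_solution(t, k, m, a0, a1, n_terms=40):
    """Sum x = A0 + A1 t + A2 t^2 + ... for m x'' + k x = 0.

    The coefficients follow from (n+2)(n+1) m A_{n+2} + k A_n = 0,
    which is the general form of the relations in Eq. (2).
    """
    coeffs = [a0, a1]
    for n in range(n_terms - 2):
        coeffs.append(-k * coeffs[n] / (m * (n + 2) * (n + 1)))
    return sum(c * t**n for n, c in enumerate(coeffs))

def closed_form(t, k, m, a0, a1):
    """Eq. (4): x = A0 cos(sqrt(k/m) t) + A1 sqrt(m/k) sin(sqrt(k/m) t)."""
    w = math.sqrt(k / m)
    return a0 * math.cos(w * t) + a1 * math.sqrt(m / k) * math.sin(w * t)

def period(k, m):
    """T = 2 pi sqrt(m/k)."""
    return 2.0 * math.pi * math.sqrt(m / k)
```

For moderate values of $\sqrt{k/m}\,t$ the truncated series and the closed form should agree to many decimal places; for large arguments more terms are needed, and rounding errors in the alternating sum eventually limit the accuracy.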
It is often convenient to use these relations in rewriting the equation of motion, writing it
$$\frac{d^2x}{dt^2} + \omega^2 x = 0, \quad\text{or}\quad \frac{d^2x}{dt^2} + 4\pi^2\nu^2 x = 0. \tag{5}$$

13. Oscillating Electric Circuit. In the last chapter, we have seen that the equation for an electric circuit containing resistance, inductance, and capacity, is $L\,di/dt + Ri + q/C = E$, where $i$ is the current, $q$ the charge on the condenser, and $E$ the impressed electromotive force. We also saw that $i = dq/dt$. Substituting, we obtain
$$L\frac{d^2q}{dt^2} + R\frac{dq}{dt} + \frac{q}{C} = E. \tag{6}$$
This is an inhomogeneous second-order linear differential equation for $q$, which becomes homogeneous if $E = 0$. We consider that case, and in particular let $R$ be zero. Then the problem becomes mathematically equivalent to the preceding one, and has the differential equation $d^2q/dt^2 + q/LC = 0$. The solution is $q = A_0\cos\sqrt{1/LC}\,t + A_1\sqrt{LC}\,\sin\sqrt{1/LC}\,t$, so that the current oscillates in the circuit. By differentiating, we can find the current directly instead of the charge: $i = dq/dt = -A_0\sqrt{1/LC}\,\sin\sqrt{1/LC}\,t + A_1\cos\sqrt{1/LC}\,t$, so that the oscillations of charge and current are similar. The period of oscillation is given by $T = 2\pi\sqrt{LC}$, increasing as either the inductance or the capacity becomes large.

14. The Exponential Method of Solution. We have found that the solutions of our vibration problems, as well as of several other differential equations, come out either as exponential functions, or as sines or cosines. As a matter of fact, any homogeneous linear differential equation, with constant coefficients, has such solutions. On account of the importance of this type of equation, we shall consider its solution specially. Let us take a second-order differential equation,
$$\frac{d^2y}{dx^2} + a\frac{dy}{dx} + by = 0, \tag{7}$$
a type which includes the mechanical and electrical problems we have worked with. We can show very easily that this has an exponential solution, $y = e^{kx}$. For let us substitute this function into the equation.
We have $dy/dx = ky$, $d^2y/dx^2 = k^2y$, so that the equation becomes $(k^2 + ak + b)e^{kx} = 0$. This equation is factored, and since $e^{kx}$ is not always zero, the other factor must be, and we have $k^2 + ak + b = 0$, or solving the quadratic by formula, $k = -a/2 \pm \sqrt{(a/2)^2 - b}$. Thus if $k$ equals either $k_1 = -a/2 + \sqrt{(a/2)^2 - b}$, or $k_2 = -a/2 - \sqrt{(a/2)^2 - b}$, $e^{kx}$ is a solution of the equation. We have, in fact, two independent solutions. Now if we have two independent solutions of a second-order linear homogeneous differential equation, we can readily show that any linear combination of them is itself a solution. If such a solution has two arbitrary constants, it is a general solution. Thus we can write the general solution of Eq. (7) $y = Ae^{k_1x} + Be^{k_2x}$, or
$$y = e^{-(a/2)x}\left[Ae^{\sqrt{(a/2)^2 - b}\,x} + Be^{-\sqrt{(a/2)^2 - b}\,x}\right]. \tag{8}$$
This is the solution, with its two arbitrary constants, and it might seem as if no further discussion were necessary. But there is an interesting feature still to consider: the quantity $(a/2)^2 - b$ under the radical may easily be negative, and the square root imaginary, so that we have to investigate the exponentials of imaginary quantities.

Suppose, for example, that the damping term is zero: $a = 0$, and the differential equation is $d^2y/dx^2 + by = 0$. This is the only case we have so far worked out in detail. Then the solution becomes $y = Ae^{i\sqrt{b}\,x} + Be^{-i\sqrt{b}\,x}$, where $i = \sqrt{-1}$. But we have already seen that the solution of this same equation is $C\cos\sqrt{b}\,x + D\sin\sqrt{b}\,x$. If both forms are right, there must be connections between exponential and sinusoidal functions, which we now proceed to investigate.

15. Complex Exponentials. Let us investigate the function $e^{ix}$ by series methods. We have at once
$$e^{ix} = 1 + ix - \frac{x^2}{2!} - \frac{ix^3}{3!} + \frac{x^4}{4!} + \cdots = \left(1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots\right) + i\left(x - \frac{x^3}{3!} + \cdots\right),$$
or $e^{ix} = \cos x + i\sin x$. Similarly we have
$$e^{-ix} = \cos x - i\sin x. \tag{9}$$
We can solve for $\cos x$ by adding these equations and dividing by 2, or for $\sin x$ by subtracting and dividing by $2i$:
$$\cos x = \frac{e^{ix} + e^{-ix}}{2}, \qquad \sin x = \frac{e^{ix} - e^{-ix}}{2i}. \tag{10}$$
These theorems are fundamental in the study of exponential and sinusoidal functions.

In terms of the formulas of the last paragraph, we can readily see that our two formulations of simple harmonic motion are both correct. For we have
$$Ae^{i\sqrt{b}\,x} + Be^{-i\sqrt{b}\,x} = A(\cos\sqrt{b}\,x + i\sin\sqrt{b}\,x) + B(\cos\sqrt{b}\,x - i\sin\sqrt{b}\,x) = (A + B)\cos\sqrt{b}\,x + i(A - B)\sin\sqrt{b}\,x,$$
or one constant times the cosine plus another times the sine, which is the more familiar solution. By giving $A$ and $B$ suitable complex values, we can have both coefficients real. But to know how to do this, and to understand the whole process, we should study complex numbers for themselves. Let us then make a little survey of the theory of complex numbers.

16. Complex Numbers. A complex number is usually written $A + Bi$, where $A$ and $B$ are real, $i = \sqrt{-1}$. It is often plotted in a diagram: we let abscissas represent real parts of numbers, ordinates the imaginary parts, so that $A$ measures the abscissa, $B$ the ordinate, of the point representing $A + Bi$. Every point in the plane corresponds to a complex number, and vice versa. All real numbers lie along the axis of abscissas, all pure imaginaries along the axis of ordinates, and the other complex numbers between. But it is also often convenient to think of a complex number as being represented, not merely by a point, but by the vector from the origin out to the point. The fundamental reason for this is that these vectors obey the parallelogram law of addition, just as force or velocity vectors do (see Fig. 2). The vector treatment is suggestive in many ways. For example, we can consider the angle between two complex numbers. Thus, any real number, and any pure imaginary number, are at an angle of 90 deg. to each other.
Or, the number $1 + i$ is at an angle of 45 deg. with either 1 or $i$. When a complex number is regarded as a vector, we can describe it by two quantities: the absolute magnitude of the vector, or its length, $\sqrt{A^2 + B^2}$; and the angle which it makes with the real axis, or $\tan^{-1} B/A$.

Fig. 2. Law of addition of complex vectors. The vector $E + Fi$ represents the vector sum of $A + Bi$ and $C + Di$. Evidently $OE = OA + AE = OA + OC$, and $OF = OB + OD$. Hence $E + Fi = (A + C) + (B + D)i$.

The vector representation of complex numbers has very close connection with complex exponential functions. Let us consider the complex number $e^{i\theta}$, where $\theta$ is a real quantity. As we have seen, this equals $\cos\theta + i\sin\theta$, so that the real part is $\cos\theta$, the imaginary part $\sin\theta$. The vector representing this number is then a vector of unit magnitude, for $\sqrt{\cos^2\theta + \sin^2\theta} = 1$. Further, it makes just the angle $\theta$ with the real axis. We can see interesting special cases. The number $e^{\pi i/2} = i$, as we can see at once from the vector diagram, or from the fact that it equals $\cos\pi/2 + i\sin\pi/2 = i$. Similarly $e^{\pi i} = -1$, $e^{2\pi i} = 1$. This last result shows that the exponential function of an imaginary argument is periodic with period $2\pi i$, similarly to the sine and cosine of a real argument.

Next we look at the number $re^{i\theta}$, where $r$, $\theta$ are both real. It differs from $e^{i\theta}$ in that both real and imaginary parts are multiplied by the same real factor $r$, which simply increases the length of the vector to $r$, without changing the angle. Thus $re^{i\theta}$ is a vector of length $r$, angle $\theta$. As a result, we can easily write any complex number in complex exponential form: $A + Bi = re^{i\theta}$, where $r = \sqrt{A^2 + B^2}$, $\theta = \tan^{-1} B/A$, or $A = r\cos\theta$, $B = r\sin\theta$ (see Fig. 3).

We may use these results in showing what happens when two complex numbers are multiplied together. Suppose we wish to form the product $(A + Bi)(C + Di)$.
Of course, multiplying directly, this equals $(AC - BD) + (AD + BC)i$, so that we can easily find real and imaginary parts of the product, but this is not very informing. It is better to write $A + Bi = r_1e^{i\theta_1}$, $C + Di = r_2e^{i\theta_2}$. Then the product is $(r_1e^{i\theta_1})(r_2e^{i\theta_2}) = (r_1r_2)e^{i(\theta_1 + \theta_2)}$. That is, the magnitude of the product of two complex numbers is the product of the magnitudes, and the angle is the sum of their angles.

Fig. 3. The complex number $P$ equals either $A + Bi$, or $re^{i\theta}$.

Suppose we have a complex number $re^{i\theta}$, and consider the closely related number $re^{-i\theta}$. The second is called the conjugate of the first. If we have a complex number in the form $A + Bi$, its conjugate is $A - Bi$. Or in general, if we change the sign of $i$ wherever it appears in a complex number, we obtain its conjugate. Graphically, the vector representing the conjugate of a number is the mirror image of the vector representing the number itself, in the axis of real numbers. Now conjugate numbers have two important properties: the sum of a number and its conjugate is real (for the imaginary parts just cancel in taking this sum), and the product is real (for this equals $r^2e^{i(\theta - \theta)} = r^2$). The second fact is useful in finding the absolute magnitude of a complex number: if $z$ is complex, $\bar{z}$ its conjugate (this is the usual notation), then $\sqrt{z\bar{z}}$ equals the absolute magnitude of $z$. From the other fact, we may find the real and imaginary parts of complex numbers: $\dfrac{z + \bar{z}}{2}$ equals the real part of $z$, and as we can easily show, $\dfrac{z - \bar{z}}{2i}$ equals the imaginary part. We see examples in our relations between sinusoidal and exponential functions, where $e^{-ix}$ is the conjugate of $e^{ix}$, so that $\dfrac{e^{ix} + e^{-ix}}{2}$ should, and does, equal the real part of $e^{ix}$, or $\cos x$, and $\dfrac{e^{ix} - e^{-ix}}{2i}$ equals the imaginary part, or $\sin x$.

17. Application of Complex Numbers to Vibration Problems.
There are two different, though related, ways of applying complex numbers to vibration problems. The first, and perhaps more logical, is directly suggested by what we have done. We found for undamped vibrations that $y = Ae^{i\sqrt{b}\,x} + Be^{-i\sqrt{b}\,x}$. Now naturally we wish $y$ to be real, since it represents a real displacement. To do this, we make use of the proposition that we have just found, that the sum of a complex number and its conjugate is real. Since $e^{-i\sqrt{b}\,x}$ is the conjugate of $e^{i\sqrt{b}\,x}$, we achieve the desired result if we make $B = \bar{A}$, for then the whole second term is just the conjugate of the first. Incidentally, if we write $A = \dfrac{C}{2}e^{-i\alpha}$, we have
$$y = \frac{C}{2}e^{i(\sqrt{b}\,x - \alpha)} + \frac{C}{2}e^{-i(\sqrt{b}\,x - \alpha)} = C\cos(\sqrt{b}\,x - \alpha), \tag{11}$$
giving a form, in terms of amplitude $C$ and phase $\alpha$, which is often useful and important.

The second method of treatment is more common, particularly in electrical applications. Suppose we work directly with the complex solution $y = Ae^{i\sqrt{b}\,x}$, but consider that only the real part is of physical significance. This real part, as we have seen, is half the sum of this quantity and its conjugate, so that, except for a factor of 2, it comes to the same thing we have considered before. However, it is often easier to think of it in this way, and the process of using a complex solution, and finally taking the real part, is very common. Of course, if $A$ is real, the real part is simply $A\cos\sqrt{b}\,x$; if $A$ is complex, we may write it $Ce^{-i\alpha}$, and the real part of the product is $C\cos(\sqrt{b}\,x - \alpha)$. This second method is particularly interesting in discussing simple harmonic motion, where $x$ is replaced by $t$, and $\sqrt{b}$ by $\omega$, so that we are considering the real part of $Ae^{i\omega t}$. The complex number is given by a vector of length $A$, rotating in the complex plane with angular velocity $\omega$. And its real part is simply the projection of the vector along the real axis.
Thus it corresponds exactly to the most elementary formulation of simple harmonic motion, as the projection of a circular motion on a diameter.

Problems

1. Show directly that the solution $A\sin\omega t + B\cos\omega t$ for the particle moving with simple harmonic motion can also be written $C\cos(\omega t - \alpha)$. Find $C$ and $\alpha$ as functions of $A$ and $B$, and vice versa. The constant $C$ is called the amplitude of the motion, and $\alpha$ is called the phase. Note that $\alpha$ can be regarded as an angle, measured in radians.

2. A pendulum 1 m. long is held at an angle of 1 deg. to the vertical, and released with an initial velocity of 5 cm. per second toward the position of equilibrium. Find amplitude and phase of the resulting motion.

3. A circuit contains resistance, inductance, and capacity, but there is no impressed e.m.f. Solve the differential equation in series, and show by comparison of the first few terms that the series represents the function $e^{-(R/2L)t}(A\sin\omega t + B\cos\omega t)$, where $\omega^2 = 1/LC - R^2/4L^2$.

4. In an oscillatory circuit, show that the phases of the charge and the current differ by 90 deg.

5. Given a complex number represented by a vector, what is the nature of the vector representing its square root; its cube root? Find the three cube roots of unity, the four fourth roots, the five fifth roots, plotting them in the complex plane, and giving real and imaginary components of each. With one of the cube roots, in terms of its real and imaginary parts, cube by direct multiplication and show that the result is unity.

6. Find real and imaginary parts of $\sqrt{3 + 5i}$, […], $\sqrt{A + Bi}$, where $A$, $B$ are real.

7. Show that $\ln(-a) = \pi i + \ln a$, or $3\pi i + \ln a$, or in general $n\pi i + \ln a$, where $n$ is an odd integer.

8. Prove that if we have a complex solution of the problem of a vibrating particle, the real part of this complex function is itself a solution of the problem.

9.
Show that in general a linear homogeneous differential equation of the $n$th order with constant coefficients has $n$ independent exponential solutions of the sort we have considered.

10. Show that if we have $n$ independent solutions of an $n$th order differential equation, then an arbitrary linear combination of these solutions, containing $n$ coefficients, is a general solution of the equation.

CHAPTER IV

DAMPED VIBRATIONS, FORCED VIBRATIONS, AND RESONANCE

We have now reached the point where we can discuss a wide range of problems in oscillatory mechanical or electrical systems. The general question we shall take up is that of a system containing inertia, damping force proportional to the velocity, and restoring force proportional to the displacement, under the action of an impressed force. This leads to an inhomogeneous second-order linear differential equation, of the form
$$m\frac{d^2x}{dt^2} + 2mk\frac{dx}{dt} + m\omega^2x = F(t), \tag{1}$$
where the coefficients $2mk$ and $m\omega^2$ of the damping and restoring force terms, respectively, are written in this way to obtain a simple result. The term $F(t)$, which makes the equation inhomogeneous, is the impressed force, a function of time. The solution of such an inhomogeneous equation, as we have seen, can be written as a sum of two parts. One is a particular solution of the problem, the so-called forced motion, a steady-state solution which persists as long as the force is applied. The other is the transient term, a general solution of the corresponding homogeneous equation obtained by setting $F = 0$. This transient proves to be a damped simple harmonic motion, an oscillation whose amplitude decreases exponentially with time, soon passing away, and leaving only the steady-state solution. The amplitude and phase of the transient are determined so that the whole motion will have the correct initial displacement and velocity, its two arbitrary constants being chosen to fit the initial conditions.

18. Damped Vibrational Motion.
We first consider the transient motion, whose equation is obtained from (1) above by setting $F = 0$. In the preceding chapter we have seen that the solution can be written
$$x = e^{-kt}\left(Ae^{\sqrt{k^2 - \omega^2}\,t} + Be^{-\sqrt{k^2 - \omega^2}\,t}\right). \tag{2}$$
There are three cases: (1) $k^2 - \omega^2 < 0$; (2) $k^2 - \omega^2 = 0$; (3) $k^2 - \omega^2 > 0$. The first is the case where the damping is small. Here $\sqrt{k^2 - \omega^2} = i\sqrt{\omega^2 - k^2}$, and the radical is real. Then we have the same sort of expression we have considered before, and to get a real answer we must write $B = \bar{A}$, or else we can take the real part of a complex quantity. Let us do the latter: the solution is the real part of $Ae^{-kt}e^{i\sqrt{\omega^2 - k^2}\,t}$, or is $Ce^{-kt}\cos(\sqrt{\omega^2 - k^2}\,t - \alpha)$. This is like a simple harmonic motion, of angular velocity $\sqrt{\omega^2 - k^2}$, phase $\alpha$, but with an amplitude $Ce^{-kt}$ which continually decreases with time, and it is called damped simple harmonic motion. For small damping, the angular velocity can be expanded in power series, and is $\omega - \dfrac{k^2}{2\omega} \cdots$, differing from $\omega$ by a small quantity of the second order. Thus, for example, a pendulum which is slightly damped will have its period only very slightly altered by the damping. The amplitudes of successive swings go down in exponential fashion, on account of the factor $e^{-kt}$. Thus the logarithms of the amplitudes go down linearly with the time, and as a result this kind of damping is known as logarithmic damping. The decrease in the logarithm of the amplitude in a period is known as the logarithmic decrement.

The other extreme case is the third, where $k^2 - \omega^2 > 0$, and there is nothing complex about the solution at all. It simply consists of two exponential terms, with only real coefficients. The resulting motion is not oscillatory, but merely damps down gradually to zero. The limiting case, $k^2 - \omega^2 = 0$, is called the critical case, and is most easily discussed as the limit of either of the others.
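The three cases are distinguished by the sign of $k^2 - \omega^2$ alone, and the quantities named above follow directly from $k$ and $\omega$. The Python sketch below is only an illustration, not part of the original text: the function names and the labels "underdamped" and "overdamped" are assumptions introduced here, and the logarithmic decrement is computed as $k$ times the damped period, which is the decrease of $\ln(Ce^{-kt})$ over one period.

```python
import math

def classify_damping(k, omega):
    """Return a label for the sign of k^2 - omega^2 (labels are illustrative)."""
    disc = k * k - omega * omega
    if disc < 0:
        return "underdamped"   # case (1): damped simple harmonic motion
    if disc == 0:
        return "critical"      # case (2): the limiting case
    return "overdamped"        # case (3): two real exponentials, no oscillation

def damped_angular_velocity(k, omega):
    """sqrt(omega^2 - k^2); defined only in the oscillatory case k < omega."""
    return math.sqrt(omega**2 - k**2)

def small_damping_estimate(k, omega):
    """First two terms of the power series: omega - k^2 / (2 omega)."""
    return omega - k**2 / (2.0 * omega)

def logarithmic_decrement(k, omega):
    """Decrease of ln(amplitude) in one period T = 2 pi / sqrt(omega^2 - k^2)."""
    return k * 2.0 * math.pi / damped_angular_velocity(k, omega)
```

With $k$ small compared with $\omega$, `damped_angular_velocity` and `small_damping_estimate` differ only by terms of order $k^4/\omega^3$, which is the sense in which the period of a slightly damped pendulum is hardly altered.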
An interesting practical application of all the cases is found in the problem of the vibrations of galvanometers. A galvanometer without damping oscillates back and forth with simple harmonic motion. With slight damping, it has nearly the same frequency, but a logarithmic decrement. As the damping is made greater and greater, the period gets larger and larger, until finally at critical damping and beyond there are no oscillations at all. The galvanometer, if displaced, simply settles slowly back to its normal position.

19. Damped Electrical Oscillations. The corresponding electrical problem is given by the circuit containing resistance, inductance, and capacity, and the equation is
$$L\frac{d^2q}{dt^2} + R\frac{dq}{dt} + \frac{q}{C} = 0. \tag{3}$$
The solution is
$$q = c\,e^{-(R/2L)t}\cos(\omega t - \alpha), \tag{4}$$
where $\omega = \sqrt{1/LC - R^2/4L^2}$. This is the same solution which we found in Prob. 3 of the last chapter by the series method. It is an interesting illustration of the simplicity of the exponential method of solving the equation. As we see, the current oscillates with an angular velocity which, for small $R$, differs only slightly from the undamped angular velocity $\sqrt{1/LC}$, but it has a logarithmic damping, which is greater the greater $R$ is.

20. Initial Conditions for Transients. To fix the two arbitrary constants of the transient, we must fit the initial displacement and velocity. Thus, for instance, consider the solution in the form $x = Ce^{-kt}\cos(\sqrt{\omega^2 - k^2}\,t - \alpha)$. Assume that at $t = 0$, $x = x_0$, and $dx/dt = v_0$. From the first,
$$x_0 = C\cos\alpha. \tag{5}$$
To apply the second, we have
$$\frac{dx}{dt} = -Ce^{-kt}\sqrt{\omega^2 - k^2}\,\sin(\sqrt{\omega^2 - k^2}\,t - \alpha) - kCe^{-kt}\cos(\sqrt{\omega^2 - k^2}\,t - \alpha).$$
Thus
$$v_0 = C\sqrt{\omega^2 - k^2}\,\sin\alpha - kC\cos\alpha. \tag{6}$$
By simultaneous solution of Eqs. (5) and (6) we can find $C$ and $\alpha$ in terms of $x_0$ and $v_0$.

21. Forced Vibrations and Resonance. Our next task is to find a particular solution of Eq. (1) containing the external applied force.
To do this, we shall first solve the case where the force is a sinusoidal function of the time, a very important special case. This leads to a solution also sinusoidal with the same frequency, with an amplitude proportional to the amplitude of the force, but for which the constant of proportionality depends on the frequency, becoming large out of all proportion if the impressed frequency is nearly equal to the natural frequency. This phenomenon of enormously exaggerated response of the oscillating system to a certain impressed frequency is called resonance; it is of great physical importance.

Familiar examples of resonance will occur to one. In mechanics, it is well known that a pendulum can be set swinging with large oscillations if it receives small periodic impulses, timed to synchronize with its own period, whereas any other impressed frequency would soon get out of step with the oscillations it sets up, and would force them to die down again. Acoustical resonance is illustrated by the way in which one vibrating tuning fork will set another into vibration if both have the same pitch, but not otherwise. Another acoustical example comes from Helmholtz's resonators: air chambers vibrating with a definite pitch, which are set into resonant vibration if sound of that particular pitch falls on them, but not appreciably by any other pitch, so that they can be used to pick out a particular note in a complicated sound and estimate its intensity. The resonance of electric circuits is illustrated in the tuned circuits of the radio, which respond only to sending stations of a particular wave length, and practically not at all to other stations. In optics, the theory of refractive index and absorption coefficient is closely connected with resonance.
As is shown by the sharp spectrum lines, atoms contain oscillators capable of damped simple harmonic motion, or at any rate act as if they did; the real theory, using wave mechanics, is complicated but leads essentially to this result. An external light wave is a sinusoidal impressed force, leading to a forced motion of the oscillators with the same frequency but different phase. The component of motion in phase with the field reacts back on the field to change its phase, and this progressive change of phase as the light travels through the body is interpreted as a changed velocity of propagation, or an index of refraction different from unity. Similarly the other component produces a diminution of intensity, or absorption. The phenomenon of anomalous dispersion, with abnormally large index of refraction and absorption coefficient, comes about when the external wave is in resonance with the atom.

22. Mechanical Resonance. Let the external force be $F_0\cos\omega t$. It is simpler to regard this as being the real part of $F_0e^{i\omega t}$. Thus we use the differential equation
$$m\frac{d^2x}{dt^2} + 2mk\frac{dx}{dt} + m\omega_0^2x = F_0e^{i\omega t}, \tag{7}$$
where we use $\omega_0$ for the natural angular frequency, to distinguish from the impressed angular velocity $\omega$. The resulting $x$ will be complex, and its real part represents the actual motion. Now we assume that the forced motion has the same frequency as the impressed force, or that $x = Ae^{i\omega t}$, where $A$ may be complex. If $A_r$ and $A_i$ are the real and imaginary parts of $A$, we easily see that the real part of $x$ is given by
$$A_r\cos\omega t - A_i\sin\omega t, \tag{8}$$
so that in general the motion has one term in phase with the force, whose amplitude is given by the real part of $A$, and another out of phase, the amplitude being the negative of the imaginary part. Substituting our exponential formula for $x$ in Eq. (7), we have
$$\left[m(-\omega^2) + 2mk(i\omega) + m\omega_0^2\right]Ae^{i\omega t} = F_0e^{i\omega t}.$$
Canceling the exponential, we have
$$A = \frac{F_0}{m}\,\frac{1}{(\omega_0^2 - \omega^2) + 2ik\omega}. \tag{9}$$
To get the coefficients of terms in phase and out of phase with the force, or $A_r$ and $-A_i$, we multiply numerator and denominator by the conjugate of the denominator, obtaining respectively
$$A_r = \frac{F_0}{m}\,\frac{\omega_0^2 - \omega^2}{(\omega_0^2 - \omega^2)^2 + 4k^2\omega^2}$$
and
$$-A_i = \frac{F_0}{m}\,\frac{2k\omega}{(\omega_0^2 - \omega^2)^2 + 4k^2\omega^2}. \tag{10}$$
These two functions are plotted in Fig. 4. It is seen that the first has the form made familiar by the anomalous dispersion curve in optics, the second resembling the corresponding absorption curve. This resemblance is an essential one, as we shall see in Chap. XXIV. One feature of the curves should be mentioned. The anomalous behavior in the neighborhood of $\omega_0$ is confined to a narrower and narrower band of frequencies as $k$ becomes smaller and smaller compared with $\omega_0$, so that if the damping is very small the resonance is very sharp, while if there is large damping, there is a broad range of frequencies over which resonance is appreciable.

Fig. 4. Amplitude of forced motion of an oscillator, as function of frequency. (a) Component in phase with force; (b) component out of phase.

23. Electrical Resonance. Suppose that a dynamo supplies sinusoidally alternating electromotive force, given by $E\cos\omega t$, to an electric circuit containing resistance, inductance, and capacity. The differential equation for the charge is then
$$L\frac{d^2q}{dt^2} + R\frac{dq}{dt} + \frac{q}{C} = E\cos\omega t. \tag{11}$$
We set up instead the differential equation for the current $i = dq/dt$, which we obtain from Eq. (11) by differentiating with respect to time:
$$L\frac{d^2i}{dt^2} + R\frac{di}{dt} + \frac{i}{C} = \frac{d}{dt}(E\cos\omega t). \tag{12}$$
As with the mechanical case, we replace $E\cos\omega t$ by the complex exponential $Ee^{i\omega t}$, of which the real part gives the electromotive force. Similarly we assume the current to be sinusoidal, given by the real part of $i_0e^{i\omega t}$. Making these changes in Eq.
(12), and carrying out the differentiations, we have
$$\left(-L\omega^2 + iR\omega + \frac{1}{C}\right)i_0e^{i\omega t} = i\omega Ee^{i\omega t}, \qquad i_0 = \frac{E}{R + i(L\omega - 1/C\omega)}. \tag{13}$$
The denominator here equals $Ze^{i\alpha}$, where
$$Z = \sqrt{R^2 + X^2}, \qquad X = L\omega - \frac{1}{C\omega}, \qquad\text{and}\qquad \alpha = \tan^{-1}\frac{X}{R}, \tag{14}$$
where $X$ is called the reactance, $Z$ the impedance. Then the current is
$$i = \frac{E}{Z}\cos(\omega t - \alpha). \tag{15}$$
The impedance takes the place of the resistance in problems involving alternating currents, since we divide the amplitude of the e.m.f. by the impedance rather than by the resistance to get the amplitude of the current. We note that the impedance is a function of frequency. It becomes infinite when the frequency becomes zero, on account of the term involving the capacity, showing that a direct current cannot go through a condenser; and also when the frequency becomes infinite, on account of the term in the inductance, showing that infinitely rapid oscillations cannot pass through the inductance. In between, it goes through a minimum, at the frequency for which $X = 0$, or $\omega = 1/\sqrt{LC}$, the natural frequency at which the circuit would oscillate by itself if it had no resistance or impressed e.m.f. Thus for impressed e.m.fs. of the same amplitude, but of a variety of frequencies, that whose frequency agrees most closely with the natural frequency will produce the largest current, and the others may produce much smaller currents, so that we have resonance, or tuning. To tune a circuit, one adjusts $L$ or $C$, or both. When it is tuned, the sharpness of tuning depends on the size of $R$. For instance, if $R$ were 0, there would be infinite response at exact resonance, so that the tuning would be infinitely sharp.

In addition to the dependence of amplitude on frequency, there is also a phase difference between e.m.f. and current, given by the quantity $\alpha$ above. We can get a simple interpretation of this in the complex plane. The quantity $R + iX$ is called the complex impedance.
Its magnitude is just the real impedance $Z$, and its phase, or angle, is the angle $\alpha$. It is interesting to note that $\alpha$ goes from $-90$ deg. at zero frequency to $+90$ deg. at infinite frequency, passing through zero at resonance.

24. Superposition of Transient and Forced Motion. The general solution of an oscillatory problem is the sum of the steady-state motion (the particular solution), and a transient with arbitrary amplitude and phase, chosen to satisfy the initial conditions. Thus, choosing an electrical case, we may have no charge and current in a circuit at $t = 0$, but start applying a sinusoidal e.m.f. at that instant. The charge and current at any later time are given by
$$q = Ae^{-\frac{R}{2L}t}\cos(\omega_0t - \alpha_0) + \frac{E}{\omega Z}\sin(\omega t - \alpha),$$
$$i = -A\omega_0e^{-\frac{R}{2L}t}\sin(\omega_0t - \alpha_0) - A\frac{R}{2L}e^{-\frac{R}{2L}t}\cos(\omega_0t - \alpha_0) + \frac{E}{Z}\cos(\omega t - \alpha),$$
where $\omega_0$ is the natural angular frequency, $A$ and $\alpha_0$ the amplitude and phase of the transient. Then to determine $A$ and $\alpha_0$ we have the equations
$$q_0 = A\cos\alpha_0 - \frac{E}{\omega Z}\sin\alpha, \qquad i_0 = A\omega_0\sin\alpha_0 - A\frac{R}{2L}\cos\alpha_0 + \frac{E}{Z}\cos\alpha, \tag{16}$$
where $q_0$, $i_0$ are initial charge and current, equal to zero for these particular initial conditions.

Three examples of the charge as a function of time are given in Fig. 5. In (a), the natural frequency is taken to be much greater than the external frequency, and the logarithmic decrement large, so that the transient is a rapidly damped high frequency vibration, which is imperceptible after a few periods of the external force. The case (b) is that in which external and natural frequencies are almost equal, and the damping small.
In this case, the forced and transient vibrations, having almost the same frequencies, form beats with each other, as one always has when two almost equal frequencies are superposed, the sum of two sine waves leading to a sinusoidal vibration whose frequency is the average of the two frequencies, but whose amplitude is modulated with the slow difference frequency between the two vibrations, as given by the equation
$$\cos\omega_1t + \cos\omega_2t = 2\cos\left(\frac{\omega_1 + \omega_2}{2}\,t\right)\cos\left(\frac{\omega_1 - \omega_2}{2}\,t\right).$$
Since the transient gradually dies down, however, the amplitude of the beats grows less and less, until gradually only the forced motion remains. In the case (c), the external frequency is exactly equal to the natural frequency. Here there are no beats, the amplitude merely building up exponentially to its final value.

Fig. 5. Transient and forced motion superposed. Curve A is forced motion, B transient, C combined motion. (a) Natural frequency high, impressed frequency low, large damping. (b) Impressed and natural frequency approximately equal. (c) Impressed and natural frequency equal.

25. Motion under General External Forces. If we are given an arbitrary external force, say $F(t)$, we shall show in a later chapter that it is possible to write it as a sum of sinusoidal terms:
$$F(t) = \text{real part of } \sum_n F_ne^{i\omega_nt}.$$
Thus any sound may be considered as made up of a superposition of pure tones, and any light as a superposition of pure colors. Now suppose we find the forced motion resulting from each of these sinusoidal vibrations acting separately, and then add them. The result will be the solution of the whole problem. For suppose $x_n(t)$ is the solution of the problem whose force is the $n$th term of the summation, so that we have
$$\left(m\frac{d^2}{dt^2} + 2mk\frac{d}{dt} + m\omega_0^2\right)x_n = F_ne^{i\omega_nt}.$$
Add all these equations. Then we have
$$\left(m\frac{d^2}{dt^2} + 2mk\frac{d}{dt} + m\omega_0^2\right)\sum_n x_n = \sum_n F_ne^{i\omega_nt},$$
showing that $\sum_n x_n$ satisfies the whole equation.
We readily see that this is a special case of a general theorem: if the impressed force, in an inhomogeneous linear equation, is written as a sum of terms, and if we have solutions of the separate problems in which only one term of the sum is impressed at a time, the solution of the whole problem is the sum of these separate solutions. We note that those particular forces whose frequencies are near the natural frequency will produce greatly exaggerated responses.

26. Generalizations Regarding Linear Differential Equations. We have made several generalizations regarding linear differential equations, and it is well to group these together. We have seen that

1. Any linear combination of solutions of a homogeneous linear differential equation is itself a solution, and if the linear combination contains as many arbitrary constants as the order of the differential equation, it is a general solution.

2. A general solution of an inhomogeneous linear differential equation is the sum of a particular solution, and a general solution of the corresponding homogeneous equation.

3. If the inhomogeneous part of an inhomogeneous linear differential equation is a sum of terms, and if we have the solutions of the equations formed by taking just one of these separately, the particular solution of the whole problem can be formed by adding these separate solutions.

Physically, the first statement means that free vibrations of a system governed by a linear differential equation may be superposed without affecting each other. The second means that free vibrations can coexist with forced vibrations; and the last, that forced vibrations from different sources can coexist without affecting each other.
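As a numerical illustration of the third statement, the following Python sketch forms the steady-state responses of a damped oscillator to several sinusoidal forces taken one at a time, adds them, and checks that the sum satisfies the equation with all the forces acting together. The form m x'' + c x' + k x = F of the equation, the coefficient names m, c, k, the function names, and all numerical values are assumptions chosen for illustration; they are not the notation of the text.

```python
import cmath

def steady_state(F_amp, omega, m=1.0, c=0.4, k=4.0):
    """Complex amplitude X of the forced motion x = Re(X e^{i omega t})
    for the assumed equation m x'' + c x' + k x = Re(F_amp e^{i omega t})."""
    return F_amp / (k - m * omega**2 + 1j * c * omega)

def residual(forces, t, m=1.0, c=0.4, k=4.0):
    """Left side minus right side of the equation of motion at time t,
    evaluated for the SUM of the separate forced solutions."""
    x = v = a = f = 0.0
    for F_amp, omega in forces:
        X = steady_state(F_amp, omega, m, c, k)
        e = cmath.exp(1j * omega * t)
        x += (X * e).real                    # displacement
        v += (1j * omega * X * e).real       # velocity
        a += (-(omega**2) * X * e).real      # acceleration
        f += (F_amp * e).real                # total impressed force
    return m * a + c * v + k * x - f

# Three sinusoidal terms (amplitude, angular frequency); illustrative values.
forces = [(1.0, 0.5), (0.3, 1.9), (0.7, 5.0)]

# The summed solution should satisfy the full equation at every instant.
worst = max(abs(residual(forces, 0.1 * n)) for n in range(200))

# Response per unit force at each frequency; the natural frequency here is 2.
amps = [abs(steady_state(1.0, w)) for _, w in forces]
print(worst, amps)
```

With these values the residual should vanish to rounding error, and the response per unit force should be largest for the term whose frequency lies nearest the natural frequency, as noted above.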
All these properties of coexistence or superposability of vibrations are characteristic only of linear equations, but, as we shall see, a great many physical phenomena are governed by such equations, so that the superposability of vibrations is of widespread physical importance.

Problems

1. A coil of resistance 2 ohms, inductance 10 millihenries, is connected to a condenser of capacity 10 microfarads. At $t = 0$, the condenser is charged to a potential of 100 volts, and no current is flowing. Find the charge on the condenser at any later time, and also the current flowing. What are the period and logarithmic decrement of the circuit? What would the resistance have to be, leaving inductance and capacity the same, such that the system would be critically damped?

2. Prove that the displacement of a particle in damped oscillation is given by

$$x = e^{-kt}\left(x_0\cos\sqrt{\omega^2 - k^2}\,t + \frac{v_0 + kx_0}{\sqrt{\omega^2 - k^2}}\sin\sqrt{\omega^2 - k^2}\,t\right),$$

where $x_0$, $v_0$ are initial values of displacement and velocity. Pass to the case of critical damping, by letting $\omega^2 - k^2$ approach zero. Show that the resulting motion has one term of the form $te^{-kt}$, and prove directly that this satisfies the differential equation.

3. Letting $\omega = k/2$, draw curves for $x$ as a function of $t$, representing the damped motion for the case where the initial velocity is zero but the initial displacement is not, and also for the case where the initial displacement is zero but the velocity is not.

4. A pendulum is damped so that its amplitude falls to half its value in 1 min. Its actual period is 2 sec. Find the change in the period which there would be if the damping were not present. (Hint: use power series expansion for frequency, treating $k$ as a small quantity.)

5. A radio receiving station has a circuit tuned to a wave length of 500 m.
It is desired to have the tuning sharp enough so that a frequency differing from this by 10,000 cycles per second gives only 1 per cent as much response as the natural frequency, for the same amplitude of signal. Work out reasonable values of resistance, inductance, and capacity to accomplish this.

6. The sharpness of tuning of a vibrating system may be measured by the so-called half breadth of the resonance band, or the frequency difference between the two frequencies for which the amplitude of response is half that at exact resonance. Prove that the ratio of half breadth to resonance frequency is proportional to the logarithmic decrement, if the damping is not too great.

7. A tuning fork of pitch C (256 vibrations per second) is so slightly damped that its amplitude after 10 sec. is 10 per cent of the original amplitude. It is set into oscillation, first by another fork of the same pitch, then by one a semitone higher, both vibrating with the same amplitude. Find the ratio of amplitudes of forced motion in the two cases. What will be the pitch of the forced vibration in the second case?

8. The support of a simple pendulum moves horizontally back and forth with simple harmonic motion. Show that this sets the pendulum into forced motion, as if there were a force applied directly to the bob. Show that the motion has the following behavior: The pendulum pivots about a point not its point of support, but such that, if it were really pivoted here, its natural period would be the actual period of the forced motion. Discuss the cases where the pivotal point is below the point of support; above the point of support. Neglect transients.

9. A particle subject to a linear restoring force and a viscous damping is acted on by a periodic force whose frequency differs from the natural frequency by a small quantity. The particle starts from rest at $t = 0$, and builds up the motion.
Discuss the whole problem, including initial conditions. Consider what happens in the limiting case when the frequency gets nearer and nearer the natural frequency, and the damping gets smaller and smaller. Show that the results are as indicated in Fig. 5, (b), (c).

10. The amplitude of the forced current in a circuit is

$$\frac{E}{R + i(L\omega - 1/C\omega)}.$$

Plot real part as abscissa, imaginary part as ordinate, obtaining a curve by taking points for all frequencies. Find the equation of the resulting curve, and prove that it is a circle.

11. Show that for a particle subject to a linear restoring force and viscous damping the maximum amplitude occurs when the applied frequency is less than the natural frequency. Find this resonance frequency. Show that maximum energy is attained when the applied frequency equals the natural frequency. What are the maximum amplitude and maximum energy?

12. The motion of an anharmonic undamped oscillator is described by

$$m\frac{d^2x}{dt^2} + m\omega_0^2 x + bx^2 = 0,$$

where $b$ is a small quantity. Solve this equation by successive approximations, expanding $x$ in a power series in powers of $b$.

13. If the oscillator in Problem 12 is acted on by a force $A\cos pt + B\cos qt$, show that the steady-state solution contains terms of frequencies $2p$, $2q$, $q + p$, $q - p$, $2q + p$, $2q - p$, etc. Note that superposition does not hold for the equation above. These new frequencies are called combination tones.

CHAPTER V

ENERGY

We have progressed far enough in our study of mechanics so that it will pay to stop and survey the situation. Mechanics is a large subject, and we may consider some of the directions in which we could extend what we have done already. In the first place, we may treat the mechanics of many sorts of systems. We may have the mechanics of particles, or of rigid bodies, or of deformable, elastic solids, or of fluid media. All these we shall treat, in more or less detail, before we are through.
What we have done so far comes under the heading of mechanics of particles, and we shall look at that field in more detail. In the first place, one almost never has real particles to deal with in a mechanical problem. Probably the closest approach is found in the kinetic theory of monatomic gases, where the atoms act like movable points exerting forces on each other. But often very large bodies can act as particles, as, for instance, the planets in their motions about the sun. Then again we can have essentially complicated systems, like pendulums, or weights suspended on springs, which yet have such simple motions that we can apply the methods of the mechanics of particles to them. Many of the problems we have treated so far have been of this sort. A particle has three coordinates, which may be $x$, $y$, $z$, and the problem of mechanics is to find the way in which these coordinates change with time. The starting point is Newton's second law of motion, giving the accelerations, or second time derivatives of the coordinates, in terms of the forces. All of our problems so far, whether dealing with actual particles or not, fall under this classification, and in fact belong to the more restricted class of one-dimensional problems, with but one coordinate $x$. The next few chapters will be devoted to the two- and three-dimensional cases of mechanics of a particle.

The one-dimensional motions of a particle fall into different classes, depending on the type of force acting. We have treated several sorts of forces: viscous resistances, linear restoring forces, external forces which are arbitrary functions of time. That is, the force may be a function of velocity, of position, or of time, or, of course, of all three combined. Most common mechanical problems are of this type, the force depending on $v$, $x$, and $t$, but this is not necessary.
For instance, in radiation problems, in electromagnetic theory, one meets a force proportional to the time derivative of acceleration, or to $d^3x/dt^3$, which turns out to act much like a viscous resistance. But such cases are rare. The simplest cases are those in which the force depends only on the coordinate. Then, in one-dimensional motion, we can always introduce a potential energy, which added to the kinetic energy gives a total energy that stays constant, expressing the conservation of energy. If, on the other hand, there are external impressed forces, the energy may increase or decrease with time, depending on whether the impressed forces do work on the system or have work done on them; while, if there are frictional forces, the energy will decrease with time, being dissipated in heat, for which reason these forces are called dissipative forces. It is plain that the study of different types of forces is closely tied up with the idea of energy, which we so far have not discussed, and we turn to this question, first deriving the mathematical formulation of kinetic energy for one-dimensional problems.

27. Mechanical Energy. Let us see where the concept of energy comes from, and how we can use it. We start with a particle of mass $m$, acted on by a force $F$. Then Newton's second law is $m\,d^2x/dt^2 = F$. Now let us multiply each side by $dx/dt$, and integrate with respect to $t$, from time $t_0$ up to $t$:

$$\int_{t_0}^{t} m\frac{dx}{dt}\frac{d^2x}{dt^2}\,dt = \int_{t_0}^{t} F\frac{dx}{dt}\,dt.$$

Both these integrals can be transformed. First, we note that

$$\frac{d}{dt}\left(\frac{dx}{dt}\right)^2 = 2\frac{dx}{dt}\frac{d^2x}{dt^2}.$$

Thus the left side is

$$\frac{m}{2}\int_{t_0}^{t}\frac{d}{dt}\left(\frac{dx}{dt}\right)^2 dt = \frac{m}{2}\left(\frac{dx}{dt}\right)^2\Bigg|_{t_0}^{t};$$

or letting $dx/dt$ be denoted by $v$, and its value at $t = t_0$ by $v_0$, this side is $mv^2/2 - mv_0^2/2$. On the right, $\int F\,(dx/dt)\,dt = \int F\,dx$, where now the integral is from $x_0$ to $x$, if $x_0$ is the value of $x$ at $t = t_0$, $x$ at $t$. Then the equation is

$$\tfrac{1}{2}mv^2 - \tfrac{1}{2}mv_0^2 = \int_{x_0}^{x} F\,dx. \tag{1}$$

The quantity $mv^2/2$ is called the kinetic energy, $\int F\,dx$ is the work done, and our equation says that the work done by the force on the particle between two instants of time equals the increase in kinetic energy during the time. This is the fundamental proposition relating to energy, and our proof is the standard one.

Next we consider the nature of the force $F$. First there is the case where it depends only on the position of the particle, as in a gravitational field or a linear restoring force, without friction. Then $F = F(x)$, and we may write

$$\int F(x)\,dx = -V(x),$$

so that $mv^2/2 + V(x) = mv_0^2/2 + V(x_0)$. The quantity $V(x)$ is called the potential energy, and the sum of it and the kinetic energy is the total energy; our equation states that the total energy remains constant during the motion. The lower limit of integration may be chosen in an arbitrary way, or an arbitrary constant of integration may be added to the potential energy, without changing the results, which depend only on potential differences. The potential energy is related to the force either by the equation above, or by its derivative, $F = -dV/dx$.

In case the force depends on the velocity as well as the position, the situation is quite different. Then the value of $F$ cannot be predicted when $x$ is known, so that we cannot even evaluate the work done without knowing more details about the system. In such a case it is plainly impossible to set up a potential energy function independent of time, or to speak of the total energy being conserved. Such a system is called nonconservative, in contrast to a conservative system in which the energy stays constant. Even in a nonconservative system it is often possible to write a potential function connected with part of the force. Thus with a damped oscillator, we can write a potential function for the restoring force, but not for the viscous resistance.
In such a case we shall still speak of the sum of the kinetic and potential energy as being the total energy, but we can no longer say that it remains constant. Rather we should say that the time rate of change of the energy was equal to the rate of working of outside forces, both of viscosity and of any external impressed forces, on the system. Let us see what this means mathematically. Let $F = -dV/dx + G$, where $V$ is the potential function for that part of the force derivable from a potential, and $G$ is the remaining force. Then the energy is $mv^2/2 + V$. The time rate of change of the energy is

$$\frac{d}{dt}\left(\frac{mv^2}{2} + V\right) = mv\frac{dv}{dt} + \frac{dV}{dx}\frac{dx}{dt} = v\left(m\frac{dv}{dt} + \frac{dV}{dx}\right),$$

using $dV/dt = (dV/dx)(dx/dt)$. But $m\,dv/dt + dV/dx = G$, by Newton's second law, so that the time derivative of energy reduces to $Gv$, or the external force times the velocity, as we should expect.

One should not be disturbed to find systems whose total energy does not stay constant. At first sight they seem to contradict the general law of conservation of energy, but on closer examination we always find that they are parts of a larger system whose energy really is conserved. Thus if we consider not merely the damped vibrating particle but also the viscous fluid doing the damping, we shall find that the latter gains the energy lost by the former, transforming it into heat, itself a form of energy. It is in fact a general situation that there are two ways of treating a mechanical problem: first, by considering the whole system, and treating it as a conservative system; second, treating only part of the system, and taking the forces exerted by the rest on this part as being impressed or dissipative forces, which cannot be derived from a potential.

28. Use of the Potential for Discussing the Motion of a System. The one-dimensional motion of a particle in a conservative field can be discussed with great ease by the use of the potential function.
Suppose we know $V$ as a function of $x$, and suppose that we inquire about the motion of a particle of total energy $E$ in this potential field. Then we have $mv^2/2 = E - V$, $v = \sqrt{2(E - V)/m}$. Since this is a known function of $x$, we can find the speed at every point. In the first place, we can use this to get an explicit solution of the problem. For writing $v = dx/dt$, and integrating, we have

$$t - t_0 = \int\frac{dx}{\sqrt{2(E - V)/m}}, \tag{2}$$

giving a relation between $t$ and $x$, involving two arbitrary constants (energy, and the constant of integration, determining the origin of time, or the phase). Thus for instance for a particle moving under gravity, where $V = mgx$, we have

$$t - t_0 = \int\frac{dx}{\sqrt{2E/m - 2gx}}.$$

Letting $z = 2E/m - 2gx$, so that $dz = -2g\,dx$, this is

$$t - t_0 = -\frac{1}{2g}\int\frac{dz}{\sqrt{z}} = -\frac{\sqrt{z}}{g},$$

where evidently $t_0$ is the value of $t$ at which $2E/m - 2gx = 0$, or $x = E/mg$, which, as we readily see, is the highest point of the path, at which the body commences to fall. If we let this value of $x$ be $x_0$, then, squaring, we have $x - x_0 = -\tfrac{1}{2}g(t - t_0)^2$, the familiar solution. Many one-dimensional problems can be solved by this method, as for instance the pendulum with large amplitude, which leads to an elliptic integral. On the other hand, there are, of course, many cases where the integration is too difficult to carry out.

Even if the solutions cannot be obtained exactly, however, we can still use the method of the energy to get general information about the problem. Let us imagine $V$ plotted as a function of $x$ (see Fig. 6). Then we draw on the same graph a horizontal line at height $E$. The square root of the difference between the two curves is then proportional to the velocity of the particle at that point. Thus the velocity is only real where this difference is positive, and is imaginary elsewhere. If the velocity is only real in certain regions of $x$, this means that the motion can only occur within those regions.
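The energy integral (2) can also be evaluated numerically when the integration cannot be carried out in closed form. The Python sketch below does this for the gravitational case, where the closed form is known, so that the two can be compared. The function name, the substitution $x = x_0 - s^2$ (used here only to remove the infinite integrand at the turning point), the midpoint rule, and all numerical values are assumptions made for illustration.

```python
import math

def time_to_turning_point(x_start, x_turn, E, V, m, steps=2000):
    """Time to travel from x_start up to the turning point x_turn, from
    t = integral of dx / sqrt(2 (E - V(x)) / m).

    The substitution x = x_turn - s**2 (dx = -2 s ds) removes the
    inverse-square-root singularity of the integrand at the turning point;
    the resulting integral over s is done with the midpoint rule."""
    s_max = math.sqrt(x_turn - x_start)
    h = s_max / steps
    total = 0.0
    for n in range(steps):
        s = (n + 0.5) * h
        x = x_turn - s * s
        speed = math.sqrt(2.0 * (E - V(x)) / m)
        total += 2.0 * s / speed * h
    return total

# Particle under gravity, V = m g x (illustrative values).
m, g = 1.0, 9.8
x_top = 5.0                      # highest point, x_0 = E / (m g)
E = m * g * x_top
V = lambda x: m * g * x

x_start = 1.0
t_numeric = time_to_turning_point(x_start, x_top, E, V, m)
# Closed form from x - x_0 = -(1/2) g (t - t_0)^2:
t_closed = math.sqrt(2.0 * (x_top - x_start) / g)
print(t_numeric, t_closed)
```

For other potentials only the function `V` and the turning point need be changed; the closed-form comparison is of course available only in special cases such as this one.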
As the particle approaches the edge of such a region, the speed gets smaller and smaller, and finally at the edge the particle stops. Then it reverses and travels away again. The possibility of going either toward or away from the boundary comes from the two signs of the square root: the velocity at a given point of space is always the same in magnitude, but can be in either direction. If now the region where the kinetic energy is positive is bounded at both ends, then after reversing its motion at one edge, the particle will travel to the other, reverse, come back, and repeat the process indefinitely. Since at a given point the particle always travels with the same speed, it will always require the same time to traverse its path, and the motion will be periodic. Thus, if the total energy is $E_1$ (Fig. 6), the motion is periodic, confined between $c$ and $e$. If it is $E_2$, either of two periodic motions is possible, between $b$ and $f$, or between $h$ and $j$. This is a general result for a conservative motion in one dimension which does not extend to infinity.

Fig. 6. Potential energy $V$ as function of coordinate $x$. Total energy $E_1$, periodic motion between $c$ and $e$. Total energy $E_2$, periodic motion between $b$ and $f$, or $h$ and $j$. Total energy $E_3$, nonperiodic motion between $a$ and infinity, reversing at $a$. Total energy $E_4$, nonperiodic nonreversing motion.

If, on the other hand, the kinetic energy remains positive in one direction all the way to infinity, but becomes negative at a finite point in the other direction, the particle will come in from infinity, having taken an infinite time to do it, will reverse, and will return to infinity. This is the case with energy $E_3$, the particle coming in from the right, speeding up in the region about $i$, slowing down about $g$, speeding up about $d$, finally coming to rest at $a$, and reversing, traveling back to the right.
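For the bounded case, the turning points and the period can be found numerically from the potential alone. The sketch below, in Python, locates the two points where $E - V = 0$ by bisection and then takes twice the time of travel between them. It is tried on a linear restoring force, where the period is known to be independent of the energy. The potential $V = \tfrac{1}{2}kx^2$, the bracketing intervals, the substitution $x = c + a\sin\theta$, and the numerical values are all assumptions for illustration.

```python
import math

def bisect_root(f, lo, hi, iters=200):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def period(E, V, m, x_left, x_right, steps=2000):
    """Twice the travel time between the turning points x_left and x_right.

    With x = c + a sin(theta), the factor cos(theta) from dx cancels the
    vanishing speed at both ends, so the midpoint rule behaves well."""
    c = 0.5 * (x_left + x_right)
    a = 0.5 * (x_right - x_left)
    h = math.pi / steps
    half = 0.0
    for n in range(steps):
        theta = -0.5 * math.pi + (n + 0.5) * h
        x = c + a * math.sin(theta)
        speed = math.sqrt(2.0 * (E - V(x)) / m)
        half += a * math.cos(theta) / speed * h
    return 2.0 * half

# Linear restoring force (illustrative values): V = k x^2 / 2.
m, k = 2.0, 8.0
V = lambda x: 0.5 * k * x * x

periods = []
for E in (1.0, 3.0):
    x_left = bisect_root(lambda x: E - V(x), -10.0, 0.0)
    x_right = bisect_root(lambda x: E - V(x), 0.0, 10.0)
    periods.append(period(E, V, m, x_left, x_right))

expected = 2.0 * math.pi * math.sqrt(m / k)
print(periods, expected)
```

For a potential like that of Fig. 6 the brackets passed to `bisect_root` would have to be chosen to enclose the particular pair of turning points wanted, since several allowed regions may exist at the same energy.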
An example of the first, periodic case is a particle vibrating in simple harmonic motion, and of the second nonperiodic case is a ball coming from infinity, hitting a wall, and being bounced back again, or a ball thrown up in the air and coming down again. Finally there is the possibility of a potential such that the kinetic energy is positive at all values of $x$. Then the particle persists in one direction forever, like a free particle, but generally travels with a variable velocity. Such a case is found for energy $E_4$, where the particle starts from infinite distance in one direction, travels toward the center, speeds up and slows down corresponding to the maxima and minima, and finally goes to infinity in the other direction.

It is to be noted that motions in the same potential field, but with different total energy, can have quite different characteristics under this classification. Thus oscillatory motions are always possible around minima in the potential energy, for small enough total energy. But it may be that for too high total energy the particle will be able to get entirely away from the neighborhood of the minimum, and will go to infinity. In Fig. 6, there are three points, $d$, $g$, and $i$, at which the force is zero, and the particle would stay at rest forever, if it were placed at one of these points. Of these, $g$ is a position of unstable equilibrium, and a small impact would start the particle oscillating, about either $d$ or $i$. On the other hand, $d$ and $i$ are both points of stable equilibrium, so that a particle at rest at either of these points would suffer only small oscillations about that point if given a small impact.

29. The Rolling-ball Analogy. A simple model which shows the properties of one-dimensional motion can be set up as follows. We imagine a track, like a roller coaster, set up, shaped just like the potential curve. Then we start a ball rolling on this track, starting from rest at a given height.
Its motion will then approximate that of a particle in the corresponding potential field. The reason is that, since gravitational potential is proportional to height, the ball actually has the potential at any point which it should, and correspondingly the correct speed. The only approximations made, other than friction, consist in neglecting the fact that part of the kinetic energy actually goes into up and down motion, and part into rotation, instead of all into horizontal motion. From such a model, we can see how motion may be oscillatory, if the track rises on the far side of a dip up to the height where the ball started, or how it can go to infinity if the track continues permanently at a lower level.

We can also see the general character of the solution in the case where there is damping, just by imagining that the ball is subject to friction. Obviously the motion still will have the character of the undamped motion, but corresponding to a continually decreasing energy. Thus with an oscillatory motion the amplitude will constantly decrease until it stops, while with a motion which originally was not oscillatory it may be possible that it become trapped in a minimum of potential, settle down to oscillate, and eventually come to rest. In any case, if the damping continues, the motion will eventually stop, at a minimum of potential.

30. Motion in Several Dimensions. So far, we have treated only the motion of a particle in one dimension. If it can move about in two- or three-dimensional space, however, the problem becomes much more difficult. Suppose the coordinates of a particle are $x$, $y$, $z$, so that its motion is described by finding $x$, $y$, and $z$ as functions of time. Force and acceleration are vectors, and our first task is to investigate vector analysis enough to deal with these quantities.
We shall find that in two and three dimensions it is by no means true that all force fields, in which the force is a function of the position alone, can be derived from a potential function. The next chapters, then, will deal with vectors, force fields, and potentials. When we come to the equations of motion, we find separate equations for each component: if $F_x$, $F_y$, $F_z$ represent the components of force along the axes, we have

$$m\frac{d^2x}{dt^2} = F_x, \qquad m\frac{d^2y}{dt^2} = F_y, \qquad m\frac{d^2z}{dt^2} = F_z, \tag{3}$$

a set of simultaneous differential equations. Such equations can be solved in a few simple cases. For instance, if $F_x$ depends only on $x$, $F_y$ only on $y$, $F_z$ only on $z$, they are simply three independent equations, which we can handle by the methods already used. This is called the method of separation of variables, and much of our effort will be directed toward this method of solution. We shall carry out methods of changing to arbitrary coordinate systems, with a view to separating variables. For instance, in motion under a force acting toward a center of attraction, we introduce polar coordinates, and in these the equation for $r$ is separated from those for the angles, so that we can solve. The process of changing coordinate systems leads us to Lagrange's equations, the equations of motion in generalized coordinates. Finally, the method of energy, the rolling-ball analogy, and the other methods of the present chapter, can be used in several dimensions, and provide the best means for a qualitative discussion of a problem.

Problems

1. Take the sinusoidal solution for the displacement of a harmonic oscillator, find the velocity from it, compute kinetic and potential energy as functions of time, and add to show that the sum remains constant. Show that the energy is proportional to the square of the amplitude.

2. Proceed as in Prob. 1, but for the damped oscillator, finding the sum of kinetic and potential energies, showing that it decreases with time.
Compute the time rate of change of the energy, find the rate of working of the frictional force, and show by direct comparison that they are equal.

3. Let a particle move in a field whose potential is $-1/x + 1/x^2$. Show by graphical methods that for small total energy the motion is oscillatory, but that for larger energy it is nonperiodic and extends to infinity. Find the energy which forms the dividing line between these two cases. Compute the limiting frequency of the oscillatory motion as the amplitude gets smaller and smaller (using the results of Prob. 1, Chap. I), and describe qualitatively how the frequency changes when the amplitude increases.

4. Solve directly the problem of the motion of a particle moving in a field of potential $-1/x + 1/x^2$, using the energy integral. Show that the mathematical solution has the physical properties found in Prob. 3.

5. Using the solution of Prob. 4, find the period of oscillation of the oscillatory solutions in the potential $-1/x + 1/x^2$, as functions of the energy. To do this, note that the two ends of the path are the values of $x$ for which $\sqrt{2(E - V)/m} = 0$. Thus the integral $\int dx/\sqrt{2(E - V)/m}$, from one of these points to the other, will give just the half period. Show that the period approaches the value found in Prob. 3 for small oscillations.

6. In an electric circuit, show that one can set up a magnetic energy $\tfrac{1}{2}Li^2$ analogous to a kinetic energy, and an electric energy $\tfrac{1}{2}q^2/C$ analogous to a potential energy. Show that the rate of change of this total energy equals the rate of working of the resistance and the applied electromotive force.

7. An atom acts like a particle held to a position of equilibrium by a definite restoring force and a viscous resistance. An external light wave exerts a sinusoidal force, the atom executing a forced vibration under the influence of the wave. Show that the atom continually absorbs energy from the wave, the energy going into the viscous resistance.
Show that the rate of absorption is proportional to the component of amplitude out of phase with the force, which we have already connected with the absorption coefficient.

8. Solve the problem of the undamped oscillator, by using the equation $t = \int dx/\sqrt{2(E - V)/m}$.

9. Discuss the problem of the pendulum with arbitrary amplitude by the graphical method. Show that for low energies the motion is oscillatory, but for high energies it is a continuous rotation. Sketch the qualitative form of curves for angular displacement as a function of time, for several energies, in both the oscillatory and rotatory ranges.

10. Set up the problem of the pendulum by the method of Prob. 8, and show that $t$ as a function of the angle is given by an elliptic integral. (Hint: Use the information about elliptic integrals given in Peirce's table; note that $1 - \cos\theta = 2\sin^2\tfrac{1}{2}\theta$.)

CHAPTER VI

VECTOR FORCES AND POTENTIALS

In our one-dimensional problems, we have had no occasion to mention vectors; however, before we can treat the detailed theory of motion in two or three dimensions, we must discuss them, and their relation to such things as potential energy.

31. Vectors and Their Components. The force, in two- or three-dimensional motion, is a vector, and we must make a study of the mathematical relations of vectors. In the first place, a vector is often denoted by its components along three axes at right angles, as $F_x$, $F_y$, $F_z$. Vectors, in the second place, obey the following law of addition: if two vectors $\mathbf{F}$ and $\mathbf{G}$ have components $F_x$, $F_y$, $F_z$ and $G_x$, $G_y$, $G_z$, respectively, the components of the sum $\mathbf{F} + \mathbf{G}$ are $(F_x + G_x)$, $(F_y + G_y)$, $(F_z + G_z)$. A graphical discussion shows that this is equivalent to the familiar parallelogram law of addition (as in Fig. 2, where the same proposition was shown for complex numbers, regarded as vectors). Third, if we multiply a vector by a constant, as $C$, each component is multiplied by this constant.
Thus the components of $C\mathbf{F}$ are $CF_x$, $CF_y$, $CF_z$. Often a constant like $C$ is called a "scalar," to distinguish it from a vector. A scalar is a quantity which has magnitude but not direction, a vector having both magnitude and direction.

It is often useful to write vectors in terms of three so-called unit vectors, $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$. Here, $\mathbf{i}$ is a vector of unit length, pointing along the $x$ axis, and similarly $\mathbf{j}$ has unit length and points along the $y$ axis, and $\mathbf{k}$ along the $z$ axis. Now we can build up a vector $\mathbf{F}$ out of them, by forming the quantity $\mathbf{i}F_x + \mathbf{j}F_y + \mathbf{k}F_z$. This is the sum of three vectors, one along each of the three axes, and the first, which is just the component of the whole vector along the $x$ axis, is $F_x$, and the other components likewise are $F_y$ and $F_z$. Thus the final vector has the components $F_x$, $F_y$, $F_z$, and is just the vector $\mathbf{F}$.

By the magnitude of a vector we mean its length. By the three-dimensional analogy to the Pythagorean theorem, by which the square on the diagonal of a rectangular prism is the sum of the squares on the three sides, the magnitude of a vector $\mathbf{F}$ equals $\sqrt{F_x^2 + F_y^2 + F_z^2}$. We often speak of unit vectors, i.e., vectors whose magnitude is 1.

The component of a vector in a given direction is simply the projection of the vector along a line in that direction. It evidently equals the magnitude of the vector, times the cosine of the angle between the direction of the vector and the desired direction. As a special example, the component of a vector $\mathbf{F}$ along the $x$ axis is $F_x$, and this must equal the magnitude of $\mathbf{F}$, times the cosine of the angle between $\mathbf{F}$ and $x$. If this angle is called $(F, x)$, then we must have

$$\cos(F, x) = \frac{F_x}{\sqrt{F_x^2 + F_y^2 + F_z^2}},$$

with similar formulas for $y$ and $z$ components.
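The formulas for the magnitude and for the cosines of the angles with the axes can be tried on a numerical case. In the short Python sketch below, the function names and the sample vector are assumptions chosen for illustration; a vector is represented simply as a tuple of its three components.

```python
import math

def magnitude(F):
    """Length of a vector given by its components (F_x, F_y, F_z)."""
    return math.sqrt(sum(c * c for c in F))

def direction_cosines(F):
    """Cosines of the angles between F and the x, y, z axes:
    each component divided by the magnitude."""
    mag = magnitude(F)
    return tuple(c / mag for c in F)

F = (2.0, -3.0, 6.0)                  # illustrative components
mag = magnitude(F)                    # sqrt(4 + 9 + 36)
cos_x, cos_y, cos_z = direction_cosines(F)
print(mag, cos_x, cos_y, cos_z)
```

The three cosines are the components of a vector of unit length along $\mathbf{F}$, so that their squares add up to 1.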
The three cosines of the angles between a given direction, as the direction of the vector $\mathbf{F}$, and the three axes, are called direction cosines, and are often denoted by letters $l$, $m$, $n$, so that in this case we have $l = \cos(F, x)$, etc. It follows immediately that $l^2 + m^2 + n^2 = 1$. We can make a simple interpretation of the direction cosines of any direction: they are the components of a unit vector in the desired direction, along the three coordinate axes.

32. Scalar Product of Two Vectors. Multiplication of two vectors is a rather special process, and there are two entirely independent products, called the "scalar product" and the "vector product." We shall first consider the scalar product. The scalar product of two vectors $\mathbf{F}$ and $\mathbf{G}$ is denoted by $\mathbf{F}\cdot\mathbf{G}$, and by definition it is a scalar, equal to either (1) the magnitude of $\mathbf{F}$ times the magnitude of $\mathbf{G}$ times the cosine of the angle between; or (2) the magnitude of $\mathbf{F}$ times the projection of $\mathbf{G}$ on $\mathbf{F}$; or (3) the magnitude of $\mathbf{G}$ times the projection of $\mathbf{F}$ on $\mathbf{G}$. From the last section we see that these definitions are equivalent.

It is often useful to have the scalar product of two vectors in terms of the components along $x$, $y$, and $z$. We find this by writing in terms of $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$. Thus we have

$$\begin{aligned}
\mathbf{F}\cdot\mathbf{G} &= (\mathbf{i}F_x + \mathbf{j}F_y + \mathbf{k}F_z)\cdot(\mathbf{i}G_x + \mathbf{j}G_y + \mathbf{k}G_z) \\
&= (\mathbf{i}\cdot\mathbf{i})F_xG_x + (\mathbf{i}\cdot\mathbf{j})F_xG_y + (\mathbf{i}\cdot\mathbf{k})F_xG_z \\
&\quad + (\mathbf{j}\cdot\mathbf{i})F_yG_x + (\mathbf{j}\cdot\mathbf{j})F_yG_y + (\mathbf{j}\cdot\mathbf{k})F_yG_z \\
&\quad + (\mathbf{k}\cdot\mathbf{i})F_zG_x + (\mathbf{k}\cdot\mathbf{j})F_zG_y + (\mathbf{k}\cdot\mathbf{k})F_zG_z.
\end{aligned}$$

But now by the fundamental definition, $\mathbf{i}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{j} = \mathbf{k}\cdot\mathbf{k} = 1$, $\mathbf{i}\cdot\mathbf{j} = \mathbf{j}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{k} = \mathbf{k}\cdot\mathbf{j} = \mathbf{k}\cdot\mathbf{i} = \mathbf{i}\cdot\mathbf{k} = 0$. Thus

$$\mathbf{F}\cdot\mathbf{G} = F_xG_x + F_yG_y + F_zG_z. \tag{1}$$

The scalar product has many uses, principally in cases where we are interested in the projections of vectors. For example, the scalar product of a vector with a unit vector in a given direction equals the projection of the vector in the desired direction.
The scalar product of a vector with itself equals the square of its magnitude, and is often denoted by $F^2$. The scalar product of two unit vectors gives the cosine of the angle between the directions of the two vectors. To prove that two vectors are at right angles, we need merely prove that their scalar product vanishes.

33. Vector Product of Two Vectors. The vector product of two vectors $\mathbf{F}$ and $\mathbf{G}$ is denoted by $(\mathbf{F} \times \mathbf{G})$, and by definition it is a vector, at right angles to the plane of the two vectors, equal in magnitude to any one of the following: (1) the magnitude of $\mathbf{F}$ times the magnitude of $\mathbf{G}$ times the sine of the angle between them; (2) the magnitude of $\mathbf{F}$ times the projection of $\mathbf{G}$ on the plane normal to $\mathbf{F}$; or (3) the magnitude of $\mathbf{G}$ times the projection of $\mathbf{F}$ on the plane normal to $\mathbf{G}$. We must further specify the sense of the vector, whether it points up or down from the plane. This is shown in Fig. 7, where we see that $\mathbf{F}$, $\mathbf{G}$, and $\mathbf{F} \times \mathbf{G}$ have the same relations as the coordinates $x$, $y$, $z$ in a right-handed system of coordinates. Another way to describe the rule in words is that, if one rotates $\mathbf{F}$ into $\mathbf{G}$, the rotation is such that a right-handed screw turning in that direction would be driven along the direction of the vector product. From this rule, we note one interesting fact: if we interchange the order of the factors, we reverse the vector. Thus $(\mathbf{F} \times \mathbf{G}) = -(\mathbf{G} \times \mathbf{F})$.

Fig. 7. Direction of the vector product.

We can compute the vector product in terms of the components, much as we did with the scalar product. Thus we have

$$\begin{aligned}
\mathbf{F} \times \mathbf{G} &= (\mathbf{i}F_x + \mathbf{j}F_y + \mathbf{k}F_z) \times (\mathbf{i}G_x + \mathbf{j}G_y + \mathbf{k}G_z) \\
&= (\mathbf{i} \times \mathbf{i})F_xG_x + (\mathbf{i} \times \mathbf{j})F_xG_y + (\mathbf{i} \times \mathbf{k})F_xG_z \\
&\quad + (\mathbf{j} \times \mathbf{i})F_yG_x + (\mathbf{j} \times \mathbf{j})F_yG_y + (\mathbf{j} \times \mathbf{k})F_yG_z \\
&\quad + (\mathbf{k} \times \mathbf{i})F_zG_x + (\mathbf{k} \times \mathbf{j})F_zG_y + (\mathbf{k} \times \mathbf{k})F_zG_z.
\end{aligned}$$

But now, as we readily see from the definition, $\mathbf{i} \times \mathbf{i} = \mathbf{j} \times \mathbf{j} = \mathbf{k} \times \mathbf{k} = 0$ (as, in fact, the vector product of any vector with itself is zero); and $\mathbf{i} \times \mathbf{j} = -(\mathbf{j} \times \mathbf{i}) = \mathbf{k}$, $\mathbf{j} \times \mathbf{k} = -(\mathbf{k} \times \mathbf{j}) = \mathbf{i}$, $\mathbf{k} \times \mathbf{i} = -(\mathbf{i} \times \mathbf{k}) = \mathbf{j}$.
Hence, rearranging terms, we have

$$\mathbf{F} \times \mathbf{G} = \mathbf{i}(F_yG_z - F_zG_y) + \mathbf{j}(F_zG_x - F_xG_z) + \mathbf{k}(F_xG_y - F_yG_x). \tag{2}$$

As an example of the use of the vector product, we may mention the angular momentum vector. If we have, as in Chap. V, a particle of mass $m$, velocity $\mathbf{v}$ (a vector), and we wish its angular momentum about a certain center, we must take $m$ times the magnitude of the radius vector times the projection of $\mathbf{v}$ at right angles to the radius. But this is just $m$ times the magnitude of the vector product of $\mathbf{r}$ and $\mathbf{v}$. Further, this vector product is a vector pointing along the axis of rotation, and in a positive direction if the rotation is positive, or counterclockwise, so that it is just in the direction conventionally assigned to the angular momentum. Hence we have

$$\text{angular momentum} = m(\mathbf{r} \times \mathbf{v}).$$

Another example of the use of the vector product comes frequently, when we may wish to prove two vectors to be parallel. To do this, we need only show that their vector product vanishes.

34. Vector Fields. Very often in physics one has vectors which are functions of position. There are two particularly common examples, a force field, and a velocity, or flux density, in a flowing fluid. In an electric or magnetic or gravitational field, for instance, the force on unit charge or pole or mass at any point of space is a vector, of components $F_x$, $F_y$, $F_z$, varying from point to point in both direction and magnitude. Often such a vector field is indicated graphically by introducing lines tangent at every point to the vector at that point, called lines of force or lines of flow, as the case may be. We shall discuss the nature of vector fields more in detail in connection with hydrodynamics and the flow of fluids, in Chap. XVII. Our present application is to force fields, and our main interest is to discover in what cases the force vector can be derived from a potential function.
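As a brief numerical check on Eq. (2) and on the angular momentum $m(\mathbf{r} \times \mathbf{v})$ of Sec. 33, the component formula can be coded directly. This is only a sketch: the helper names `cross` and `scale`, and the numerical values of the mass, radius vector, and velocity, are assumptions made here for illustration.

```python
def cross(f, g):
    """Vector product by components, as in Eq. (2)."""
    fx, fy, fz = f
    gx, gy, gz = g
    return (fy * gz - fz * gy,
            fz * gx - fx * gz,
            fx * gy - fy * gx)

def scale(c, f):
    """Product of a scalar c with a vector f."""
    return tuple(c * component for component in f)

# Illustrative values: a particle on the x axis moving in the +y direction,
# i.e. counterclockwise about the z axis.
mass = 2.0
r = (3.0, 0.0, 0.0)
v = (0.0, 4.0, 0.0)

angular_momentum = scale(mass, cross(r, v))
print("m (r x v) =", angular_momentum)

# Interchanging the factors reverses the vector.
print("v x r =", cross(v, r))
```

With these values the angular momentum points along $+z$, as expected for counterclockwise motion, and interchanging the factors reverses the sign of the product.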
To investigate this, let us consider the energy theorem in three dimensions, deriving the work done in an arbitrary displacement.

35. The Energy Theorem in Three Dimensions. Let us start with the equations of motion of a particle in a force field,

$$m\frac{d^2x}{dt^2} = F_x, \qquad m\frac{d^2y}{dt^2} = F_y, \qquad m\frac{d^2z}{dt^2} = F_z. \tag{3}$$

Multiplying by $dx/dt$, $dy/dt$, $dz/dt$, respectively, and integrating with respect to time, we have as in the last chapter

$$\tfrac{1}{2}mv_x^2 - \tfrac{1}{2}mv_{x0}^2 = \int F_x\,dx, \qquad \tfrac{1}{2}mv_y^2 - \tfrac{1}{2}mv_{y0}^2 = \int F_y\,dy, \qquad \tfrac{1}{2}mv_z^2 - \tfrac{1}{2}mv_{z0}^2 = \int F_z\,dz.$$

Adding,

$$\tfrac{1}{2}m(v_x^2 + v_y^2 + v_z^2) - \tfrac{1}{2}m(v_{x0}^2 + v_{y0}^2 + v_{z0}^2) = \int (F_x\,dx + F_y\,dy + F_z\,dz). \tag{4}$$

Now $v_x^2 + v_y^2 + v_z^2$ is the square of the magnitude of the velocity, or is $v^2$. Thus the left side of Eq. (4) is the final kinetic energy of the particle minus the initial kinetic energy, so that the integral on the right should be the work done. The integrand is evidently a scalar product: the product of the vector $\mathbf{F}$ and the infinitesimal displacement vector of components $dx$, $dy$, $dz$, which we may call $d\mathbf{s}$. The scalar product, which is $\mathbf{F} \cdot d\mathbf{s}$, is the displacement times the projection of the force in the direction of motion. This is what is ordinarily called the work done, since only the component of force along the motion does work. The integral is simply the sum of all the infinitesimal amounts of work done, or is the total work done, as in one-dimensional motion.

36. Line Integrals and Potential Energy. The integral $\int \mathbf{F} \cdot d\mathbf{s}$ is called a line integral, for its evaluation demands the knowledge of a definite path between starting point and end point, as well as of the function $\mathbf{F}$. In general this integral will depend on the path as well as the end points. For instance, suppose the lines of force went in circles, as in Fig. 8.
Then the work done along the path $ABC$ is positive, since the force and displacement are parallel; along $ADC$ the work is negative, since force and displacement are opposite; while along $AEC$ it is zero, force and displacement being at right angles. Intermediate paths would yield any value we chose for the work done. Hence we surely could not define a potential, for the work done between $A$ and $C$ could not be set equal to the difference of potential in any unique way.

Fig. 8. A nonconservative force field. The work done along $ABC$ is positive, along $ADC$ negative, and along $AEC$ zero, so that the work done between $A$ and $C$ is not independent of path.

In discussing one-dimensional motion, we saw that a potential depending only on position could not be introduced if the force depended on velocity, time, or anything except displacement. Here the condition is more stringent: we cannot have a potential, even if force depends only on position, unless the integral $\int \mathbf{F} \cdot d\mathbf{s}$ is independent of path. If this condition is satisfied, however, we can set up a potential energy $V$, such that $-\int \mathbf{F} \cdot d\mathbf{s}$, taken from some standard point where the potential is zero up to the point we are interested in, equals $V$. Evidently another way of stating the criterion for existence of a potential is that the work done in taking a particle about any arbitrary closed path, or $\oint \mathbf{F} \cdot d\mathbf{s}$, where the integral is about a closed curve and back to the starting point, be zero. Still a third condition, easier to apply in actual cases, will be derived in a later section.

37. Force as Gradient of Potential. Let us suppose that it is possible to set up a potential function $V$ in a given case. We know how to write $V$ as the negative line integral of $\mathbf{F}$. Now we ask the opposite question: Given $V$, how do we find $\mathbf{F}$? Let us suppose that we are at a given point of space, and that we allow the coordinates to increase by small amounts $dx$, $dy$, $dz$, forming a vector $d\mathbf{s}$, while at the same time we exert a force $-\mathbf{F}$ to balance the force of the field.
Then first, we shall do the work $-\mathbf{F} \cdot d\mathbf{s}$ on the system; second, the potential will increase by the amount $dV = V(x + dx, y + dy, z + dz) - V(x, y, z)$. These must be equal; and writing the scalar product as $F_s\,|d\mathbf{s}|$, where $|d\mathbf{s}|$ is the magnitude of the displacement and $F_s$ the component of $\mathbf{F}$ parallel to the displacement, we have

$$dV = -\mathbf{F} \cdot d\mathbf{s} = -F_s\,|d\mathbf{s}|, \qquad F_s = -\frac{dV}{ds}. \tag{5}$$

A derivative of the sort occurring in Eq. (5), where we take the difference of a scalar function like $V$ at two neighboring points, divide by the magnitude of the displacement, and pass to the limit, is called a directional derivative, for evidently its value depends on the direction in which the displacement is made. We thus have the result that the component of force in any direction is the negative directional derivative of the potential in the desired direction.

The $x$ component of force is determined from the directional derivative of $V$ along the $x$ direction. To find that, we allow $x$ to increase by $dx$, keeping $y$ and $z$ fixed; divide the difference $V(x + dx, y, z) - V(x, y, z)$ by $dx$; and pass to the limit as $dx$ becomes small. But this is simply the partial derivative of $V$ with respect to $x$. We see, in other words, that a partial derivative of a function is merely a special case of a directional derivative, in which the direction is along one of the coordinate axes. Using this fact, we then have

$$F_x = -\frac{\partial V}{\partial x}, \qquad F_y = -\frac{\partial V}{\partial y}, \qquad F_z = -\frac{\partial V}{\partial z}. \tag{6}$$

The three partial derivatives in Eq. (6) are evidently the components of a vector, called the gradient of $V$, and abbreviated grad $V$. Thus

$$\operatorname{grad} V = \mathbf{i}\frac{\partial V}{\partial x} + \mathbf{j}\frac{\partial V}{\partial y} + \mathbf{k}\frac{\partial V}{\partial z}, \tag{7}$$

and we may write a vector equation

$$\mathbf{F} = -\operatorname{grad} V. \tag{8}$$

38. Equipotential Surfaces. Let us take a displacement $d\mathbf{s}$ in a direction tangent to an equipotential surface, or surface on which $V$ is constant. Then $V$ does not change, so that $dV = 0$; and since $dV = -\mathbf{F} \cdot d\mathbf{s}$, no work is done and $\mathbf{F} \cdot d\mathbf{s} = 0$. If this is so, then $\mathbf{F}$ and $d\mathbf{s}$ must be at right angles.
Thus we have proved that the force, and hence the lines of force, are at right angles to the equipotential surfaces. Any scalar function of position can be described by a set of surfaces, like equipotentials, on which it is constant. We see then that the gradient of such a function is a vector, at right angles to the equipotentials, measuring the rate of change of the function in this direction. The name gradient comes from contour maps in two dimensions. There the contours are lines of constant altitude, and the ordinary gradient of a slope is the rate of change of height with horizontal distance, in the direction at right angles to the contours, or the direction of steepest slope. In our case, the gradient points in the direction in which the function increases, while the force, being the negative gradient of the potential, points in the direction in which the potential decreases.

39. The Curl and the Condition for a Conservative System. Let $F_x = -\partial V/\partial x$, $F_y = -\partial V/\partial y$. Differentiating the first with respect to $y$, the second with respect to $x$, we have $\partial F_x/\partial y = -\partial^2 V/\partial y\,\partial x$, $\partial F_y/\partial x = -\partial^2 V/\partial x\,\partial y$. But by the fundamental theorem of partial differentiation, these two are equal, so that $\partial F_x/\partial y = \partial F_y/\partial x$. Similarly we have two other equations. These can be combined in a single vector equation. We shall find that it is useful to set up a vector called the curl, according to the definition

$$\operatorname{curl} \mathbf{F} = \mathbf{i}\left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right) + \mathbf{j}\left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right) + \mathbf{k}\left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right). \tag{9}$$

Then our three equations are combined in the one vector equation $\operatorname{curl} \mathbf{F} = 0$. These form relations between the components of force, which plainly must be fulfilled if there is a potential. Yet it is by no means true that any set of forces will satisfy these conditions. The vanishing of the curl at all points of space, then, is a necessary condition which $\mathbf{F}$ must satisfy, if it is derivable from a potential.
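The necessary condition can be tried out numerically by estimating the derivatives in Eq. (9) with central differences. The sketch below rests on assumptions made only for illustration: the function names, the sample potential $V = x^2y + z$, the sample field with components $(0, x, 0)$, and the step size `h` are not taken from the text.

```python
def curl(field, point, h=1e-5):
    """Central-difference estimate of curl F at a point, following Eq. (9)."""
    def partial(component, axis):
        # Derivative of one component of the field along one coordinate axis.
        forward = list(point)
        backward = list(point)
        forward[axis] += h
        backward[axis] -= h
        return (field(*forward)[component] - field(*backward)[component]) / (2.0 * h)

    return (partial(2, 1) - partial(1, 2),   # dFz/dy - dFy/dz
            partial(0, 2) - partial(2, 0),   # dFx/dz - dFz/dx
            partial(1, 0) - partial(0, 1))   # dFy/dx - dFx/dy

def gradient_force(x, y, z):
    """F = -grad V for the illustrative potential V = x**2 * y + z."""
    return (-2.0 * x * y, -x ** 2, -1.0)

def shear_force(x, y, z):
    """An illustrative field that is not the gradient of any potential."""
    return (0.0, x, 0.0)

point = (0.7, -1.3, 2.1)
curl_gradient = curl(gradient_force, point)
curl_shear = curl(shear_force, point)

print("curl of -grad V:", curl_gradient)
print("curl of (0, x, 0):", curl_shear)
```

The first field is derived from a potential and its estimated curl vanishes to within rounding error; the second has a curl of unit magnitude along $z$, so by the condition just stated it cannot be derived from a potential.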
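It can be proved that it is also a sufficient condition, so that it is the criterion which we desired, telling whether a potential can be set up in a given problem or not. As we shall see in a problem, the nonvanishing of the curl of a vector in general means whirlpool-like lines of force, as in Fig. 8.

40. The Symbolic Vector $\nabla$. We have seen two vector differential operators, the gradient and the curl. These can both be expressed conveniently in terms of a symbolic vector operator $\nabla$, equal to $(\mathbf{i}\,\partial/\partial x + \mathbf{j}\,\partial/\partial y + \mathbf{k}\,\partial/\partial z)$. Of course, this operator by itself has no meaning, but its interpretation is that it is always to be followed by some other quantity, and the differentiations are to be performed on this quantity. Thus if we have a scalar $V$, the quantity $\nabla V$ is a vector, equal to

$$\nabla V = \left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right)V = \mathbf{i}\frac{\partial V}{\partial x} + \mathbf{j}\frac{\partial V}{\partial y} + \mathbf{k}\frac{\partial V}{\partial z} = \operatorname{grad} V. \tag{10}$$

Similarly, if we have a vector $\mathbf{F}$, the vector product $(\nabla \times \mathbf{F})$ is equal to

$$(\nabla \times \mathbf{F}) = \mathbf{i}\left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right) + \mathbf{j}\left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right) + \mathbf{k}\left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right) = \operatorname{curl} \mathbf{F}. \tag{11}$$

In the course of time, we shall meet several other vector operations, which can be expressed in terms of $\nabla$. We shall merely define them now, though we shall have many applications later. If we have a vector $\mathbf{F}$, the scalar product of $\nabla$ with $\mathbf{F}$, or $(\nabla \cdot \mathbf{F})$, is a scalar, evidently equal to

$$\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}.$$

This is called the divergence of $\mathbf{F}$, abbreviated div $\mathbf{F}$. Again, if we have a scalar $V$, and take two factors $\nabla$ multiplied by $V$, or $(\nabla \cdot \nabla)V$, the result is

$$\left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right) \cdot \left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right)V = \frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2}.$$

This is called the Laplacian of $V$, and there is no usual abbreviation, except $\nabla^2 V$, which evidently is equivalent to the method of writing above. Clearly $\nabla^2 V = \operatorname{div} \operatorname{grad} V$. Finally we can take the Laplacian of a vector: if $\mathbf{F}$ is a vector,

$$\nabla^2 \mathbf{F} = \mathbf{i}\left(\frac{\partial^2 F_x}{\partial x^2} + \frac{\partial^2 F_x}{\partial y^2} + \frac{\partial^2 F_x}{\partial z^2}\right) + \mathbf{j}\left(\frac{\partial^2 F_y}{\partial x^2} + \frac{\partial^2 F_y}{\partial y^2} + \frac{\partial^2 F_y}{\partial z^2}\right) + \mathbf{k}\left(\frac{\partial^2 F_z}{\partial x^2} + \frac{\partial^2 F_z}{\partial y^2} + \frac{\partial^2 F_z}{\partial z^2}\right).$$

Problems

1.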
Find the angle between the diagonal of a cube and one of the edges. (Hint: regard the diagonal as a vector $\mathbf{i} + \mathbf{j} + \mathbf{k}$.)

2. Given a vector $\mathbf{i} + 2\mathbf{j} + 3\mathbf{k}$, and a second $\mathbf{i} - 2\mathbf{j} + a\mathbf{k}$, find $a$ so that the two vectors are at right angles to each other.

3. Let $F_x = y$, $F_y = -x$, $F_z = 0$. Prove that this vector field represents a force tangent to circles about the origin in the $xy$ plane. Compute $\oint \mathbf{F} \cdot d\mathbf{s}$ around such a circle.

4. Find the curl of the force in the preceding problem. Discuss the question as to whether it is a conservative field or not.

5. In the gravitational field of a mass $m$, the potential is given by $-m/r$, where $r$ is the distance from the mass, given by $r^2 = x^2 + y^2 + z^2$, if the mass is at the origin. Obtain the components of the force vector by direct differentiation. Find the curl of the force, and show that it is zero.

6. Find which ones of the following forces are derivable from potentials, and describe the physical nature of the force fields. Set up the potential in cases where that can be done:

(a) $F_x = \dfrac{y}{x^2 + y^2}$, $F_y = \dfrac{-x}{x^2 + y^2}$, $F_z = 0$.

(b) $F_x = \dfrac{y}{\sqrt{x^2 + y^2}}$, $F_y = \dfrac{-x}{\sqrt{x^2 + y^2}}$, $F_z = 0$.

(c) $F_x = xf(r)$, $F_y = yf(r)$, $F_z = zf(r)$, where $f(r)$ is an arbitrary function of the distance from the origin.

(d) $F_x = f_1(x)$, $F_y = f_2(y)$, $F_z = f_3(z)$.

7. Prove that $lx + my + nz = k$, where $l$, $m$, $n$, $k$ are constants, and $l^2 + m^2 + n^2 = 1$, is the equation of a plane whose normal has the direction cosines $l$, $m$, $n$, and whose shortest distance from the origin is $k$.

8. Taking the force field derived from the potential of Prob. 5, find the line integral $\oint \mathbf{F} \cdot d\mathbf{s}$ around a square of arbitrary size in the $xy$ plane, with the origin at its center. Show by direct calculation that the integral always vanishes.
Do the same for a path made up as follows: the part of the square of side $2a$, made of lines at $x = -a$, $y = \pm a$, which lies at negative values of $x$, and the part of the circle of radius $a$, center at the origin, which joins onto and completes the figure for positive values of $x$.

9. Prove that $\mathbf{A} \cdot (\mathbf{B} \times \mathbf{C}) = \mathbf{B} \cdot (\mathbf{C} \times \mathbf{A}) = \mathbf{C} \cdot (\mathbf{A} \times \mathbf{B})$, where $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$ are any vectors. Show that these are equal to the determinant

$$\begin{vmatrix} A_x & A_y & A_z \\ B_x & B_y & B_z \\ C_x & C_y & C_z \end{vmatrix}.$$

10. Prove that $\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \mathbf{B}(\mathbf{A} \cdot \mathbf{C}) - \mathbf{C}(\mathbf{A} \cdot \mathbf{B})$, where $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$ are any vectors.

11. Prove that $\operatorname{div} a\mathbf{F} = a \operatorname{div} \mathbf{F} + (\mathbf{F} \cdot \operatorname{grad} a)$, where $a$ is a scalar, $\mathbf{F}$ a vector.

12. Prove that $\operatorname{curl} a\mathbf{F} = a \operatorname{curl} \mathbf{F} + [(\operatorname{grad} a) \times \mathbf{F}]$, where $a$ is a scalar, $\mathbf{F}$ a vector.

13. Prove that $\operatorname{div} (\mathbf{F} \times \mathbf{G}) = (\mathbf{G} \cdot \operatorname{curl} \mathbf{F}) - (\mathbf{F} \cdot \operatorname{curl} \mathbf{G})$, where $\mathbf{F}$, $\mathbf{G}$ are vectors.

14. Prove that $\operatorname{div} \operatorname{curl} \mathbf{F} = 0$, where $\mathbf{F}$ is any vector.

15. Prove that $\operatorname{curl} \operatorname{curl} \mathbf{F} = \operatorname{grad} \operatorname{div} \mathbf{F} - \nabla^2 \mathbf{F}$, where $\mathbf{F}$ is any vector.

CHAPTER VII

LAGRANGE'S EQUATIONS AND PLANETARY MOTION

In considering mechanical problems with several variables, it is seldom very convenient to use ordinary rectangular coordinates. In working with problems in the motion of particles, we often wish to introduce curvilinear coordinates, as for instance polar coordinates. With rigid dynamics, we often use rather complicated quantities to give the orientation of a rigid body in space. For instance, with a top or gyroscope, we may use Euler's angles, namely, the latitude and longitude angles of the axis of the top with reference to a fixed north pole, and the angle of rotation of the top about its own axis. All these coordinates come under the general description of generalized coordinates. Any quantities which are capable of describing the positions of the parts of a system, whether they be distances, angles, or any other quantities, can serve as generalized coordinates. Now when we begin to examine the equations of motion in generalized coordinates, we naturally find that they can be very complicated.
In a later section we shall start with the ordinary equations of motion in rectangular coordinates, introduce new coordinates as functions of the old, and find the new equations of motion by direct change of variables. We find many new terms coming in, as soon as the change of variables is at all complicated. But we shall find that there are several fairly simple ways of writing the equations of motion, different in form from Newton's equations, though essentially identical, which preserve their simple form even in generalized coordinates. The most elementary of these methods is that of Lagrange's equations, and we consider them in this chapter.

41. Lagrange's Equations. We start our discussion of Lagrange's equations merely by restating Newton's second law of motion, in a slightly different way. For the moment, we consider only problems where there is a potential energy function. Since $F_x = -\partial V/\partial x$, etc., the equations of motion, written in terms of momenta, are

$$\frac{d(mv_x)}{dt} = -\frac{\partial V}{\partial x}, \tag{1}$$

etc. There is an interesting way in which these equations can be written. Let the kinetic energy be called $T$, so that

$$T = \frac{m}{2}(v_x^2 + v_y^2 + v_z^2), \tag{2}$$

if it is written in terms of the velocity components. If we keep this form, we observe that $mv_x = \partial T/\partial v_x$, which we note is the $x$ component of momentum. Hence we can write our equations

$$\frac{d}{dt}\left(\frac{\partial T}{\partial v_x}\right) + \frac{\partial V}{\partial x} = 0, \tag{3}$$

etc. But this can be put in another form, if we let $T - V = L$, called the Lagrangian function (and different from the total energy, which is $T + V$). $T$ is to be considered a function of the velocity components, and $V$ of the coordinates, so that $L$ is a function of all these six variables. Since $T$ depends only on the $v$'s, and $V$ only on the $x$'s, we have $\partial T/\partial v_x = \partial L/\partial v_x$, $\partial V/\partial x = -\partial L/\partial x$, etc. Hence the equations of motion are

$$\frac{d}{dt}\left(\frac{\partial L}{\partial v_x}\right) - \frac{\partial L}{\partial x} = 0, \tag{4}$$

with similar equations for $y$ and $z$.
In this form, the equations are called Lagrange's equations of motion, and they are simply convenient ways of writing Newton's second law of motion.

As we have stated, the importance of Lagrange's equations is that they hold in any sort of coordinates, not merely in rectangular coordinates. Thus, if the coordinates are $q_1 \ldots q_n$, and their time derivatives are $\dot{q}_1 \ldots \dot{q}_n$, the equations are

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_i}\right) - \frac{\partial L}{\partial q_i} = 0. \tag{5}$$

Here again $L = T - V$, but it is no longer true, as it was before, that $T$ depends only on the velocities and $V$ only on the coordinates. Instead, $T$ generally involves the coordinates as well, so that the term $\partial L/\partial q_i$ has some contributions coming from $\partial T/\partial q_i$, which are evidently absent in rectangular coordinates. We shall see by an example that these terms are a sort of fictitious force introduced by using the generalized coordinates, and of which the centrifugal force in polar coordinates is a typical case. We postpone a proof of Lagrange's equations to a later section, giving first an example of their usefulness by discussing the motion of a particle in a central field, as a planet about the sun.

42. Planetary Motion. As an example of two-dimensional motion, and of the Lagrangian equations, we consider the case where $V = V(r)$, a function only of the distance $r$ from a given point. This problem is almost impossible to discuss completely if we use rectangular coordinates, but if we take polar coordinates, $r$, $\theta$, we find that we can separate variables, and that the problem is then easily solved. To apply Lagrange's method to this case, we write $L$ as a function of $r$, $\theta$, $\dot{r}$, and $\dot{\theta}$. Then we have

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{r}}\right) - \frac{\partial L}{\partial r} = 0, \qquad \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\theta}}\right) - \frac{\partial L}{\partial \theta} = 0. \tag{6}$$

First we find $L$. The velocity is made up of two vector components at right angles, along the radius and along the tangent to a circle. The first is $\dot{r}$, the second $r\dot{\theta}$, so that $v^2 = \dot{r}^2 + r^2\dot{\theta}^2$, and

$$L = T - V = \frac{m}{2}(\dot{r}^2 + r^2\dot{\theta}^2) - V(r).$$
Differentiating,

$$\frac{\partial L}{\partial \dot{r}} = m\dot{r}, \qquad \frac{\partial L}{\partial \dot{\theta}} = mr^2\dot{\theta}, \qquad \frac{\partial L}{\partial r} = mr\dot{\theta}^2 - \frac{dV}{dr}, \qquad \frac{\partial L}{\partial \theta} = 0. \tag{7}$$

Then the equations are

$$\frac{d}{dt}(m\dot{r}) - mr\dot{\theta}^2 + \frac{dV}{dr} = 0, \qquad \frac{d}{dt}(mr^2\dot{\theta}) = 0. \tag{8}$$

The second may be immediately integrated: $mr^2\dot{\theta} = \text{constant}$. This has a simple interpretation, for $mr^2\dot{\theta}$ is simply the angular momentum, since $mr^2$ is the moment of inertia, $\dot{\theta}$ the angular velocity, and our equation states that it is constant, since no torque is acting. As a matter of fact, $\partial L/\partial \dot{q}_i$ is called the generalized momentum associated with the generalized coordinate $q_i$, and linear and angular momenta are special cases of the generalized momentum. Let then $mr^2\dot{\theta} = p$, where $p$ is a constant (momenta are conventionally called $p$, as coordinates are called $q$).

Next we may consider the first equation, $m\,d^2r/dt^2 = mr\dot{\theta}^2 - dV/dr$. The first term on the right-hand side is at first unexpected. But when we look at it, we see that it is the centrifugal force, which must be added to the external force to produce the radial acceleration.

We can now solve our equations. Setting $mr^2\dot{\theta} = p$, we have $\dot{\theta} = p/mr^2$, so that

$$m\frac{d^2r}{dt^2} = \frac{p^2}{mr^3} - \frac{dV}{dr} = -\frac{d}{dr}\left(V + \frac{p^2}{2mr^2}\right).$$

We have separated the variable $r$ from $\theta$, and the result is just like the equation for a one-dimensional problem with a potential $V + p^2/2mr^2$, the latter term being a sort of fictitious potential energy coming from the centrifugal force. For example, if the force is a gravitational one, $V = -Gmm'/r$, where $m'$ is the mass of the attracting body, $G$ the gravitational constant, so that we have the problem of the apparent potential $-Gmm'/r + p^2/2mr^2$. Except for the constants, this is the case of the potential $-(1/x) + (1/x^2)$, which we have already taken up in Probs. 3 and 4, Chap. V. We showed there that motions of negative energy are oscillatory in $r$, so that the orbit is concentrated in a finite region, and motions of positive energy go to infinity.
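The oscillatory character of the radial motion at negative energy can be seen numerically by locating the two radii at which the apparent potential equals the energy. The following is a rough sketch under stated assumptions: units are chosen so that $Gmm' = 1$, $m = 1$, $p = 1$, the energy value is arbitrary, and the function names and the simple bisection routine are introduced here only for illustration.

```python
def effective_potential(r, k=1.0, m=1.0, p=1.0):
    """Apparent potential V + p^2/(2 m r^2) for V = -k/r, with k standing for G m m'."""
    return -k / r + p ** 2 / (2.0 * m * r ** 2)

def bisect(f, lo, hi, tol=1e-12):
    """Root of f between lo and hi by bisection; f must change sign on the interval."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

E = -0.25  # an illustrative negative energy

def excess(r):
    """Apparent potential minus energy; zero at the turning points of the radial motion."""
    return effective_potential(r) - E

# With k = m = p = 1 the apparent potential has its minimum at r = p^2/(m k) = 1,
# so one turning point lies inside r = 1 and the other outside.
inner_turning_point = bisect(excess, 0.1, 1.0)
outer_turning_point = bisect(excess, 1.0, 10.0)

print("inner turning point:", inner_turning_point)
print("outer turning point:", outer_turning_point)
```

Between the two radii found in this way the apparent potential lies below the energy, so the radial motion is confined to that interval, as in the one-dimensional problems of Chap. V.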
We leave the exact discussion to a problem, but it proves to be true that the finite orbits are periodic and are ellipses with the attracting center at one focus, while the open orbits are hyperbolas. This is, however, a special case, and we proceed to a qualitative discussion of the general central motion, by the method of energy.

43. Energy Method for Radial Motion in Central Field. We have seen that the radial motion of a particle in a central field is just like the one-dimensional motion of a particle in a potential $V + (p^2/2mr^2)$, where $p$ is the constant angular momentum. This problem can be discussed as in Chap. V, plotting the curve $V + (p^2/2mr^2)$ as a function of $r$, and drawing the horizontal line at height $E$, as in Fig. 6. Aside from this, we can make no general statement. But in many important physical cases, the curve resembles $A$ or $B$ in Fig. 9, the rise at $r = 0$ arising from the centrifugal force, and the potential $V$ representing attraction in $A$, repulsion in $B$. With energy $E_1$ in either case, the motion would come in from infinity to a smallest distance ($c$ or $d$), called the perihelion, from the astronomical analogy, perihelion meaning near the sun. It would then reverse, and travel outward for infinite time. The energy $E_2$, however, would represent no possible motion with the curve $B$, but with the attractive potential $A$, which resembles the gravitational attraction mentioned in the preceding section, there would be oscillatory motion between the perihelion $a$ and the aphelion $b$. This motion would be periodic, and the radius as function of time, and likewise the period, could be computed by the method of the energy integral discussed in Chap. V.

Fig. 9. Curves of $V + p^2/2mr^2$ as functions of $r$. Case $A$, attraction; $B$, repulsion. With energy $E_1$, motion goes to infinity with either potential; with $E_2$, motion impossible with curve $B$, oscillatory between limits $a$ and $b$ with curve $A$.

44.
Orbits in Central Motion. The best picture of central motion is obtained by considering the orbit in space, as in Fig. 10. Suppose we consider a motion oscillatory in $r$, as the case $E_2$ of Fig. 9. Then we may draw two circles, of radii equal to the perihelion and aphelion distances, respectively, and the motion will take place between the circles. The orbit must be tangential to both circles, as shown. If the motion starts on the outer circle, the particle will move with continually decreasing radius until it touches the inner circle. At the same time, however, on account of the angular momentum, it will be turning around, and the angle made by the radius vector will have turned through a definite amount between the points of contact with outer and inner circles. After touching the inner circle, the whole procedure is reversed, $r$ increasing to the maximum value, so that after a certain time the point will touch the outer circle again.

Fig. 10. Orbit of a particle in central motion.

Now between the two successive points where the orbit touches the outer circle, there will be a certain length of arc. It may be that this is a rational fraction, say $m/n$, of the circumference, where $m$ and $n$ are integers. In that case, after $n$ excursions to the center and out again, the aphelion point will have gone around the circle $m$ times, and will have come back to the starting point. Thus the motion is periodic, repeating itself after a certain length of time. For example, if the particle is attracted to the center according to the inverse square, $m/n$ is just 1, and the particle always comes back to the same point on the circle. But if the length of arc is an irrational fraction of the circumference, as in Fig. 10, the motion is not periodic, and will never repeat itself. Nevertheless, it is what is called doubly periodic.
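The angle swept out between two successive contacts with the outer circle can be estimated by integrating the radial and angular equations step by step. The sketch below is only illustrative and rests on several assumptions not found in the text: units with $m = Gmm' = p = 1$; an optional extra potential $-\epsilon/r^2$ (a hypothetical perturbation, included only to produce an orbit that does not close); a fourth-order Runge-Kutta step; and the function names and step size.

```python
import math

def derivatives(state, eps):
    """Time derivatives of (r, dr/dt, theta) for V = -1/r - eps/r^2, with m = p = 1."""
    r, radial_velocity, _theta = state
    # m r'' = p^2/(m r^3) - dV/dr, with dV/dr = 1/r^2 + 2 eps/r^3
    radial_acceleration = (1.0 - 2.0 * eps) / r ** 3 - 1.0 / r ** 2
    angular_velocity = 1.0 / r ** 2          # theta' = p/(m r^2)
    return (radial_velocity, radial_acceleration, angular_velocity)

def rk4_step(state, dt, eps):
    """One fourth-order Runge-Kutta step."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))
    k1 = derivatives(state, eps)
    k2 = derivatives(shifted(state, k1, dt / 2.0), eps)
    k3 = derivatives(shifted(state, k2, dt / 2.0), eps)
    k4 = derivatives(shifted(state, k3, dt), eps)
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def angle_between_aphelia(r_start, eps=0.0, dt=1e-3):
    """Angle turned through between leaving one aphelion and reaching the next."""
    state = (r_start, 0.0, 0.0)              # start at rest radially, on the outer circle
    while True:
        new_state = rk4_step(state, dt, eps)
        # The next aphelion is where dr/dt passes from positive to negative.
        if state[1] > 0.0 and new_state[1] <= 0.0:
            fraction = state[1] / (state[1] - new_state[1])
            return state[2] + fraction * (new_state[2] - state[2])
        state = new_state

r_start = 2.0 + math.sqrt(2.0)               # an illustrative starting radius
closed_orbit_angle = angle_between_aphelia(r_start)
open_orbit_angle = angle_between_aphelia(r_start, eps=0.1)

print("inverse square, angle / 2 pi:", closed_orbit_angle / (2.0 * math.pi))
print("perturbed,      angle / 2 pi:", open_orbit_angle / (2.0 * math.pi))
```

For the inverse-square case the ratio comes out as 1 to within the accuracy of the integration, in agreement with the statement above; with the assumed perturbation the ratio differs from 1, so successive aphelion points fall at different places on the outer circle.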
The motion resembles a slowly rotating ellipse, rotating so that successive aphelion points, instead of lying on top of each other, are displaced with respect to each other by a given angle. This slow rotation is called precession, and one can find the frequency, and angular velocity, of the precessional motion. If now we imagined a turntable to rotate with the precessional frequency, and traced out the motion on this turntable, the path would be closed, somewhat like an ellipse. In other words, the whole motion is a combination of a periodic motion, superposed on a rotation. These two motions have in general entirely independent frequencies, and that is the origin of the statement that the motion is doubly periodic.

45. Justification of Lagrange's Method. We shall now show in our special case of polar coordinates how Lagrange's method could be justified, using this as a model for the general treatment. Surely the equations of motion are

$$m\frac{d^2x}{dt^2} = -\frac{\partial V}{\partial x}, \qquad m\frac{d^2y}{dt^2} = -\frac{\partial V}{\partial y}.$$

We introduce the polar coordinates, $x = r\cos\theta$, $y = r\sin\theta$. Then $dx/dt = \cos\theta\,dr/dt - r\sin\theta\,d\theta/dt$, and

$$\frac{d^2x}{dt^2} = \frac{d^2r}{dt^2}\cos\theta - 2\sin\theta\,\frac{dr}{dt}\frac{d\theta}{dt} - r\sin\theta\,\frac{d^2\theta}{dt^2} - r\cos\theta\left(\frac{d\theta}{dt}\right)^2, \tag{9}$$

$$\frac{d^2y}{dt^2} = \frac{d^2r}{dt^2}\sin\theta + 2\cos\theta\,\frac{dr}{dt}\frac{d\theta}{dt} + r\cos\theta\,\frac{d^2\theta}{dt^2} - r\sin\theta\left(\frac{d\theta}{dt}\right)^2. \tag{10}$$

Using these, we can write the equations of motion for $x$ and $y$ in terms of $r$ and $\theta$. But now multiply the $x$ equation, with Eq. (9) substituted, by $\cos\theta$, and the $y$ equation, with Eq. (10) substituted, by $\sin\theta$, and add. The result on the left is $m\,d^2r/dt^2 - mr(d\theta/dt)^2$, and on the right $-(\partial V/\partial x\,\cos\theta + \partial V/\partial y\,\sin\theta)$, which is just $-\partial V/\partial r$, since the latter should be $-(\partial V/\partial x\;\partial x/\partial r + \partial V/\partial y\;\partial y/\partial r)$, and $\partial x/\partial r = \cos\theta$, $\partial y/\partial r = \sin\theta$. Thus we have the first of Lagrange's equations. Next, multiply the $x$ equation, with Eq. (9) substituted, by $-r\sin\theta$, and the $y$ equation, with Eq. (10) substituted, by $r\cos\theta$, and add.
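On the left, we have $2mr\,(dr/dt)(d\theta/dt) + mr^2\,d^2\theta/dt^2$, which equals $m\,d/dt\,(r^2\,d\theta/dt)$, and on the right we have $r\,\partial V/\partial x\,\sin\theta - r\,\partial V/\partial y\,\cos\theta = -\partial V/\partial\theta$. Thus the second equation becomes $m\,d/dt\,(r^2\,d\theta/dt) = -\partial V/\partial\theta$, the second of Lagrange's equations (whose right member is zero in the case of a central field).

Just such a change of variables can be carried out in the general case. Suppose that, for the sake of simplicity, we still take only two dimensions; the general proof goes through in just the same way, except with more complicated expressions. We start with two rectangular coordinates $x$ and $y$, in terms of which we have the ordinary Newtonian equations $m\,d^2x/dt^2 = -\partial V/\partial x$, $m\,d^2y/dt^2 = -\partial V/\partial y$, and two generalized coordinates $q_1$ and $q_2$, given as functions of $x$ and $y$, so that $q_1 = q_1(x, y)$, $q_2 = q_2(x, y)$, or conversely we can write $x$ and $y$ as functions of $q_1$ and $q_2$: $x = x(q_1, q_2)$, $y = y(q_1, q_2)$. We must remember carefully what these quantities are functions of, in taking partial derivatives. Now we have

$$\frac{dx}{dt} = \frac{\partial x}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial x}{\partial q_2}\frac{dq_2}{dt},$$

$$\frac{d^2x}{dt^2} = \frac{\partial x}{\partial q_1}\frac{d^2q_1}{dt^2} + \frac{\partial x}{\partial q_2}\frac{d^2q_2}{dt^2} + \frac{dq_1}{dt}\left(\frac{\partial^2 x}{\partial q_1^2}\frac{dq_1}{dt} + \frac{\partial^2 x}{\partial q_1\,\partial q_2}\frac{dq_2}{dt}\right) + \frac{dq_2}{dt}\left(\frac{\partial^2 x}{\partial q_1\,\partial q_2}\frac{dq_1}{dt} + \frac{\partial^2 x}{\partial q_2^2}\frac{dq_2}{dt}\right),$$

with a similar equation for $d^2y/dt^2$. In terms of these, we set up the equations $m\,d^2x/dt^2 = -\partial V/\partial x$, etc. Then we multiply the first by $\partial x/\partial q_1$, the second by $\partial y/\partial q_1$, and add. We have

$$\begin{aligned}
m\Bigg\{ &\left[\left(\frac{\partial x}{\partial q_1}\right)^2 + \left(\frac{\partial y}{\partial q_1}\right)^2\right]\frac{d^2q_1}{dt^2} + \left(\frac{\partial x}{\partial q_1}\frac{\partial x}{\partial q_2} + \frac{\partial y}{\partial q_1}\frac{\partial y}{\partial q_2}\right)\frac{d^2q_2}{dt^2} \\
&+ \left(\frac{\partial x}{\partial q_1}\frac{\partial^2 x}{\partial q_1^2} + \frac{\partial y}{\partial q_1}\frac{\partial^2 y}{\partial q_1^2}\right)\left(\frac{dq_1}{dt}\right)^2 + 2\left(\frac{\partial x}{\partial q_1}\frac{\partial^2 x}{\partial q_1\,\partial q_2} + \frac{\partial y}{\partial q_1}\frac{\partial^2 y}{\partial q_1\,\partial q_2}\right)\frac{dq_1}{dt}\frac{dq_2}{dt} \\
&+ \left(\frac{\partial x}{\partial q_1}\frac{\partial^2 x}{\partial q_2^2} + \frac{\partial y}{\partial q_1}\frac{\partial^2 y}{\partial q_2^2}\right)\left(\frac{dq_2}{dt}\right)^2 \Bigg\} = -\left(\frac{\partial V}{\partial x}\frac{\partial x}{\partial q_1} + \frac{\partial V}{\partial y}\frac{\partial y}{\partial q_1}\right) = -\frac{\partial V}{\partial q_1}.
\end{aligned}$$

It will next be shown that the rather complicated expression on the left is equal to

$$\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{q}_1}\right) - \frac{\partial T}{\partial q_1},$$

where $T$ is the kinetic energy.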
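To do this, we first have

$$T = \frac{m}{2}\left[\left(\frac{\partial x}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial x}{\partial q_2}\frac{dq_2}{dt}\right)^2 + \left(\frac{\partial y}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial y}{\partial q_2}\frac{dq_2}{dt}\right)^2\right].$$

Then by differentiation, remembering that $\dot{q}_1 = dq_1/dt$,

$$\frac{\partial T}{\partial \dot{q}_1} = m\left[\left(\frac{\partial x}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial x}{\partial q_2}\frac{dq_2}{dt}\right)\frac{\partial x}{\partial q_1} + \left(\frac{\partial y}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial y}{\partial q_2}\frac{dq_2}{dt}\right)\frac{\partial y}{\partial q_1}\right],$$

$$\begin{aligned}
\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{q}_1}\right) = m\Bigg\{ &\left(\frac{\partial x}{\partial q_1}\frac{d^2q_1}{dt^2} + \frac{\partial x}{\partial q_2}\frac{d^2q_2}{dt^2}\right)\frac{\partial x}{\partial q_1} + \left(\frac{\partial y}{\partial q_1}\frac{d^2q_1}{dt^2} + \frac{\partial y}{\partial q_2}\frac{d^2q_2}{dt^2}\right)\frac{\partial y}{\partial q_1} \\
&+ \left(\frac{dq_1}{dt}\right)^2\left[\frac{\partial}{\partial q_1}\left(\frac{\partial x}{\partial q_1}\right)^2 + \frac{\partial}{\partial q_1}\left(\frac{\partial y}{\partial q_1}\right)^2\right] \\
&+ \frac{dq_1}{dt}\frac{dq_2}{dt}\left[\frac{\partial}{\partial q_2}\left(\frac{\partial x}{\partial q_1}\right)^2 + \frac{\partial}{\partial q_2}\left(\frac{\partial y}{\partial q_1}\right)^2 + \frac{\partial}{\partial q_1}\left(\frac{\partial x}{\partial q_1}\frac{\partial x}{\partial q_2}\right) + \frac{\partial}{\partial q_1}\left(\frac{\partial y}{\partial q_1}\frac{\partial y}{\partial q_2}\right)\right] \\
&+ \left(\frac{dq_2}{dt}\right)^2\left[\frac{\partial}{\partial q_2}\left(\frac{\partial x}{\partial q_1}\frac{\partial x}{\partial q_2}\right) + \frac{\partial}{\partial q_2}\left(\frac{\partial y}{\partial q_1}\frac{\partial y}{\partial q_2}\right)\right] \Bigg\}.
\end{aligned}$$

Also

$$\frac{\partial T}{\partial q_1} = m\left[\left(\frac{\partial x}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial x}{\partial q_2}\frac{dq_2}{dt}\right)\left(\frac{\partial^2 x}{\partial q_1^2}\frac{dq_1}{dt} + \frac{\partial^2 x}{\partial q_1\,\partial q_2}\frac{dq_2}{dt}\right) + \left(\frac{\partial y}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial y}{\partial q_2}\frac{dq_2}{dt}\right)\left(\frac{\partial^2 y}{\partial q_1^2}\frac{dq_1}{dt} + \frac{\partial^2 y}{\partial q_1\,\partial q_2}\frac{dq_2}{dt}\right)\right].$$

Combining these two expressions, it is easy to see that we have just the quantity which we desired. We have then the equation

$$\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{q}_1}\right) - \frac{\partial T}{\partial q_1} = -\frac{\partial V}{\partial q_1}.$$

If we set $L = T - V$, and remember that, since $V$ does not depend on the velocities, $\partial V/\partial \dot{q}_1 = 0$, this becomes

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_1}\right) - \frac{\partial L}{\partial q_1} = 0,$$

or Lagrange's equation for $q_1$. Similarly we can prove the equation for $q_2$.

It is worth remarking that the method which we have used for proving Lagrange's equations, though straightforward and simple in principle, is not the one usually employed. More often a derivation is given using the calculus of variations, which avoids most of the algebraic complications, but which on the other hand is more difficult in the fundamental ideas involved.

Problems

1. A particle of mass $m$ is attracted to a center by a force $-Gmm'/r^2$. Find perihelion and aphelion distances as a function of energy and angular momentum. Assuming that the orbit is an ellipse, prove that its major axis is $-Gmm'/E$.

2. In Prob. 1, show that it is possible for perihelion and aphelion distances to be equal, so that the orbit is circular.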
Find the necessary relation between energy and angular momentum for this to happen, and check this relation by elementary discussion, balancing the centrifugal force in the circular motion against the attraction.

3. A particle in an inverse square field executes an elliptical motion with the center of attraction as a focus. Find the period of this motion, by considering the radial motion, proceeding as in Prob. 5, Chap. V, using the results of that problem if you wish, but finding the period in terms of energy and angular momentum.

4. Discuss in detail the motion of a planet about a sun, proving that, if the energy is negative, the orbit is elliptical with the sun at a focus, and finding the relations between the major and minor axes of the ellipse and the energy and angular momentum. A procedure for the discussion is suggested as follows: Assuming the angular momentum to be $p = mr^2\dot\theta = \text{constant}$, show that the energy is $\frac{p^2}{2m}\left[\left(\frac{du}{d\theta}\right)^2 + u^2\right] - Gmm'u$, where $u = 1/r$. Find $du/d\theta$ from the equation of an ellipse in polar coordinates, with one focus as a pole, which is $u = \frac{1 - e\cos\theta}{a(1 - e^2)}$, where $a$ is the semi-major axis, $e$ the eccentricity, so that $b$, the semi-minor axis, is given by $b^2/a^2 = 1 - e^2$. Substituting your value of $du/d\theta$ into the expression for energy, show that the result is a constant, independent of $\theta$, and equal to $E$, if the major axis and eccentricity are properly chosen.

5. Suppose a particle of mass $m$, charge $e$, collides with a very heavy particle which has charge $e'$, so that it repels with a potential energy $ee'/r$. The first particle is moving with a velocity $v_0$ at a great distance, and is aimed so that, if it continued in a straight line, it would pass by the center of repulsion at a minimum distance $R$. Note that this determines the angular momentum. Using the energy method, find the perihelion distance as a function of $R$ and the velocity of the particle.

6. Discuss in detail the motion of the particle of Prob.
5, showing that it will be deflected so that after the collision the line of travel will make an angle $\phi$ with the initial direction, where $\tan\frac{\phi}{2} = \frac{ee'}{mv_0^2R}$. Such deflections are observed in collisions between alpha particles and atomic nuclei, in Rutherford's scattering experiments. Suggestions: the particle executes a hyperbolic orbit, and the desired angle is the angle between the asymptotes. Now the equation of a hyperbola in polar coordinates is just like that of an ellipse, as given in Prob. 4, except that the eccentricity is greater than 1, so that the term $1 - e\cos\theta$ can become zero, and $r$ infinite, giving the angles of the asymptotes in terms of $e$. We need then only determine $e$ in terms of energy and angular momentum, from the equations found in Prob. 4.

7. A two-dimensional linear oscillator is attracted to a center by a force proportional to the distance, or $F_x = -ax$, $F_y = -ay$. Solve in rectangular coordinates, separating variables, showing that $x$ and $y$ execute independent simple harmonic vibrations of the same frequency. Prove that the resulting orbit is an ellipse, with its center at the center of attraction.

8. Taking the solution of Prob. 7 in rectangular coordinates, find the angular momentum vector by ordinary vector formulas from the displacement and velocity, and prove by direct computation that it remains constant. Find the angular momentum as a function of the dimensions of the elliptical orbit, and show its connection with the area of the orbit.

9. Set up the problem of the two-dimensional linear oscillator, as in Prob. 7, using polar coordinates. Separate variables, solve the radial problem by the energy method, compute the period in this way, and show that it is in agreement with the period as found in Prob. 7.
CHAPTER VIII

GENERALIZED MOMENTA AND HAMILTON'S EQUATIONS

In the last chapter we have found the equations of motion in generalized coordinates, but we have not considered the meaning in these coordinates of the simple concepts of momentum and force. We shall accordingly examine these questions, and shall see that the equations can be interpreted in the form that the force equals the time rate of change of momentum, which as we know is a more fundamental statement than that it is the mass times acceleration. Using the momentum, we can then restate the equations in a form called Hamilton's equations, equivalent to Lagrange's equations, but more powerful in some applications to advanced mechanics.

46. Generalized Forces. In many mechanical problems we have to deal with forces which cannot be derived from a potential. Let us see how such forces may be included in the Lagrangian scheme. For simplicity we take a two-dimensional problem, and let the $x$ and $y$ components of force be $F_x$ and $F_y$, which may depend on time, velocity, etc., as well as position. For generality, we assume that part of the force can be derived from a potential, the rest not, so that we have $F_x = -(\partial V/\partial x) + F_x'$, etc., where $F_x'$ is the part of the force not derivable from a potential. Now if we proceed with the proof of Lagrange's equations as in the last chapter, we easily find

$$\frac{d}{dt}\left(\frac{\partial T}{\partial \dot q_1}\right) - \frac{\partial T}{\partial q_1} = -\frac{\partial V}{\partial q_1} + F_x'\frac{\partial x}{\partial q_1} + F_y'\frac{\partial y}{\partial q_1},$$

with a similar equation for $q_2$. We may introduce as before a Lagrangian function, containing the part of the external forces derivable from a potential: $L = T - V$. Then

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_1}\right) - \frac{\partial L}{\partial q_1} = F_x'\frac{\partial x}{\partial q_1} + F_y'\frac{\partial y}{\partial q_1} = Q_1, \tag{1}$$

with a similar equation for $q_2$, where $Q_1$, $Q_2$ are called the generalized forces connected with the coordinates $q_1$, $q_2$. The equation in this form may be used to discuss any arbitrary problem, for example of damped motion, in generalized coordinates.
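As an illustration of the remark about damped motion, the generalized forces can be worked out for a resisting force proportional to the velocity, in plane polar coordinates. This is only a sketch under stated assumptions: the damping coefficient $b$ and the particular force law $F_x' = -b\dot x$, $F_y' = -b\dot y$ are chosen here for illustration and do not appear in the text above.

```latex
% Assumed (illustrative) nonconservative force: F' = -b v, with b > 0 a hypothetical constant.
% Generalized coordinates: q_1 = r, q_2 = \theta, with x = r\cos\theta, y = r\sin\theta.
\begin{align*}
\dot x &= \dot r\cos\theta - r\dot\theta\sin\theta, &
\dot y &= \dot r\sin\theta + r\dot\theta\cos\theta, \\
Q_r &= F_x'\frac{\partial x}{\partial r} + F_y'\frac{\partial y}{\partial r}
     = -b\,(\dot x\cos\theta + \dot y\sin\theta) = -b\,\dot r, \\
Q_\theta &= F_x'\frac{\partial x}{\partial\theta} + F_y'\frac{\partial y}{\partial\theta}
     = -b\,(-\dot x\, r\sin\theta + \dot y\, r\cos\theta) = -b\,r^2\dot\theta .
\end{align*}
% With L = \tfrac{m}{2}(\dot r^2 + r^2\dot\theta^2) - V, the equations of motion with these forces read
\begin{align*}
m\ddot r - m r\dot\theta^2 + \frac{\partial V}{\partial r} &= -b\,\dot r, &
\frac{d}{dt}\left(m r^2\dot\theta\right) + \frac{\partial V}{\partial\theta} &= -b\,r^2\dot\theta .
\end{align*}
```

Under these assumptions $Q_r$ has the dimensions of a force while $Q_\theta$ has those of a torque, so the two generalized forces are not simply two components of one vector.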
It is worth noting that these generalized forces are closely related to the work done in an arbitrary displacement, just as ordinary forces are in rectangular coordinates. For imagine the generalized coordinates changed by amounts $dq_1$, $dq_2$. There will be a certain amount of work done on the system, equal to $-dV + dW$, where $dW$ is the work done by the external nonconservative force $F'$ (a force is spoken of as conservative if it is derivable from a potential, nonconservative otherwise). Now in general we have

$$dW = F_x'\,dx + F_y'\,dy = \left(F_x'\frac{\partial x}{\partial q_1} + F_y'\frac{\partial y}{\partial q_1}\right)dq_1 + \left(F_x'\frac{\partial x}{\partial q_2} + F_y'\frac{\partial y}{\partial q_2}\right)dq_2 = Q_1\,dq_1 + Q_2\,dq_2, \tag{2}$$

or the sum of products of generalized forces by generalized displacements. It is, of course, plain that all these arguments work equally well with more than two generalized coordinates.

The forces $Q$ which we have just introduced were the external applied forces not derivable from a potential. But we may well consider all the forces together. We could write Lagrange's equations as

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_i}\right) = Q_i - \frac{\partial V}{\partial q_i} + \frac{\partial T}{\partial q_i}. \tag{3}$$

The three terms on the right of Eq. (3) may be taken to be three terms of the force. The first is the generalized force not derivable from a potential, the second the force derivable from a potential, the third the fictitious force, like a centrifugal force, arising from the fact that the coordinate system is not rectangular. Equation (3) states that this total force equals the time rate of change of a certain quantity, and it seems reasonable to consider this quantity as a generalized momentum.

47. Generalized Momenta. In simple cases the quantity $\partial L/\partial \dot q_i$ plays the part of a momentum. Thus in rectangular coordinates, we have $\partial L/\partial \dot x = m\dot x$, or exactly the momentum associated with the coordinate $x$. Similarly in polar coordinates the quantities associated with $r$ and $\theta$ are $m\dot r$, the radial momentum, and $mr^2\dot\theta$, the angular momentum, respectively.
These are but examples of a general rule, and as a matter of fact we define $\partial L/\partial \dot q_i$ to be the generalized momentum associated with the coordinate $q_i$, denoting it by $p_i$. We note that generalized momenta are not of the same dimensions as ordinary momenta, in general; they are not simply components of the momentum referred to other coordinates. Similarly generalized forces are not simply components of forces. For instance, it is easily shown that in polar coordinates the generalized force $Q_r$ is the component of force along $r$, but $Q_\theta$ is the moment of force, or torque, which by Eq. (3) above equals the time rate of change of the angular momentum.

48. Hamilton's Equations of Motion. Assuming no external forces $Q$, we could evidently write Lagrange's equations in the form $dp_i/dt - \partial L/\partial q_i = 0$, or $dp_i/dt = \partial L/\partial q_i$, which, taken together with the definitions $p_i = \partial L/\partial \dot q_i$, would form a complete system. But there is a neater method, known as Hamilton's method, which we use instead. We can first see how Hamilton's equations are set up in rectangular coordinates. There we have $T = (m/2)(\dot x^2 + \dot y^2 + \dot z^2)$. Then it is true that we have, for instance,

$$p_x = \frac{\partial L}{\partial \dot x} = \frac{\partial T}{\partial \dot x} = m\dot x.$$

We can also write $T$, not in terms of the velocities $\dot x$, $\dot y$, $\dot z$, but in terms of the momenta $p_x$, $p_y$, $p_z$. Since $\dot x = p_x/m$, we have

$$T(p_x, p_y, p_z) = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right),$$

where we must specify that $T$ is a function of the $p$'s. Then we have

$$\frac{\partial T(p_x, p_y, p_z)}{\partial p_x} = \frac{p_x}{m} = \frac{m\dot x}{m} = \dot x,$$

and similarly $\partial T(p)/\partial p_y = \dot y$, $\partial T(p)/\partial p_z = \dot z$. These take the place of the equations $p_x = \partial T(\dot x, \dot y, \dot z)/\partial \dot x$, etc.

Now in Hamilton's method we set up what is called the Hamiltonian function $H$. This is in all ordinary cases simply the total energy $T + V$, in which $T$ is expressed in terms of the momenta, rather than the velocities. Thus we have $H = H(q_i, p_i)$, meaning that it is a function of the coordinates and momenta.
Then

$$\frac{\partial H}{\partial q_i} = \frac{\partial T}{\partial q_i} + \frac{\partial V}{\partial q_i},$$

which in rectangular coordinates gives $\partial V/\partial q_i = -\partial L/\partial q_i$, so that in this case Lagrange's equation becomes $dp_i/dt = -\partial H/\partial q_i$. Similarly

$$\frac{\partial H}{\partial p_i} = \frac{\partial T}{\partial p_i} = \dot q_i = \frac{dq_i}{dt}.$$

The resulting equations are called Hamilton's equations:

$$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i}. \tag{4}$$

It is evident that they show a symmetry between $p_i$ and $q_i$, which is one reason for preferring them over Lagrange's equations. For a given problem, there are twice as many Hamiltonian equations as Lagrangian equations, but they are only first-order rather than second-order differential equations, so that it comes down essentially to the same thing.

49. General Proof of Hamilton's Equations. Our proof holds only in rectangular coordinates, and we must next give a general proof. As before, we start with Lagrange's equations, which we assume are correct, and we define the momenta as derivatives of the Lagrangian function with respect to the velocities. Then we set up the Hamiltonian function in terms of the Lagrangian function, by the equation

$$H = \sum_i p_i\dot q_i - L. \tag{5}$$

This seems at first quite different from our elementary definition of $H$ as the energy, but we shall show in the next paragraph that it is equivalent. We express the Hamiltonian in terms of coordinates and momenta, writing the velocities $\dot q_j$, where they appear both in $\sum_j p_j\dot q_j$ and $L$, in terms of the momenta, so that we have

$$H = \sum_j p_j\,\dot q_j(p_k, q_k) - L\big[\dot q_j(p_k, q_k),\, q_j\big].$$

Then we have

$$\frac{\partial H}{\partial p_i} = \dot q_i + \sum_j p_j\frac{\partial \dot q_j}{\partial p_i} - \sum_j \frac{\partial L}{\partial \dot q_j}\frac{\partial \dot q_j}{\partial p_i}.$$

But since by definition $p_j = \partial L/\partial \dot q_j$, the last two terms cancel, leaving

$$\frac{\partial H}{\partial p_i} = \dot q_i = \frac{dq_i}{dt}.$$

Similarly,

$$\frac{\partial H}{\partial q_i} = \sum_j p_j\frac{\partial \dot q_j}{\partial q_i} - \sum_j \frac{\partial L}{\partial \dot q_j}\frac{\partial \dot q_j}{\partial q_i} - \frac{\partial L}{\partial q_i}.$$

This time the first two terms cancel, leaving $\partial H/\partial q_i = -\partial L/\partial q_i$, so that by Lagrange's equations

$$\frac{\partial H}{\partial q_i} = -\frac{dp_i}{dt}.$$

Thus we have proved both of Hamilton's equations in the general case.

It remains to be shown that the Hamiltonian function, as we have defined it, is the same as the total energy.
First we consider the kinetic energy expressed in terms of the velocities. This is a homogeneous quadratic function of the velocities:

$$T = \sum_{j,k} A_{jk}\,\dot q_j\dot q_k, \tag{6}$$

where the $A$'s are coefficients depending in general on the coordinates, and we are to sum over all possible values of $j$ and $k$. In particular, for rectangular coordinates, $A_{jk} = m/2$ if $j = k$, $0$ if $j \neq k$. In cases where the coordinates are orthogonal, that is, the coordinate surfaces intersect at right angles, as they do, for instance, in spherical polar coordinates, or in fact in all the coordinate systems in common use, only square terms come in, all coefficients $A_{jk}$ being zero if $j \neq k$. But in oblique coordinate systems, this is not true. Now for such a homogeneous quadratic expression we have the theorem

$$\sum_i \dot q_i\frac{\partial T}{\partial \dot q_i} = 2T,$$

which we can immediately prove. For $\frac{\partial T}{\partial \dot q_i} = \sum_j (A_{ij} + A_{ji})\,\dot q_j$, so that

$$\sum_i \dot q_i\frac{\partial T}{\partial \dot q_i} = \sum_i\sum_j (A_{ij} + A_{ji})\,\dot q_i\dot q_j.$$

The double sum is now just twice the sum of Eq. (6) which we previously gave as $T$, proving the theorem. Hence, using $\partial T/\partial \dot q_i = \partial L/\partial \dot q_i = p_i$, we have $T = \frac{1}{2}\sum_i p_i\dot q_i$, so that our definition in Eq. (5) of $H$ gives $H = 2T - L = 2T - T + V = T + V = $ total energy, as we wished to prove.

In advanced work, one sometimes meets cases where $H$ is not equivalent to the total energy. Such cases are found, for instance, where magnetic forces are present. But even here, the following general rules are correct: First, set up a Lagrangian function, so that the equations of motion can be written in Lagrangian form. This can sometimes, as in the magnetic case, be done, even if we cannot interpret the Lagrangian function as $T - V$; for in the magnetic case, the forces are not derivable from a potential, depending rather on the velocity, and yet vary in such a way that we can use a Lagrangian function. Next, define the momenta as $p_i = \partial L/\partial \dot q_i$. Set up the Hamiltonian function $\sum_i p_i\dot q_i - L$, expressing it in terms of coordinates and momenta.
Then Hamilton's equations hold, using this Hamiltonian.

50. Example of Hamilton's Equations. Let us by way of illustration work out Hamilton's equations for the problem of planetary motion, discussed in the previous chapter by Lagrange's method. In terms of the coordinates $r$ and $\theta$, we found that

$$L = \frac{m}{2}\left(\dot r^2 + r^2\dot\theta^2\right) - V(r).$$

Then the momenta are $p_r = \partial L/\partial \dot r = m\dot r$, the ordinary momentum along the radius, and $p_\theta = \partial L/\partial \dot\theta = mr^2\dot\theta$, the angular momentum. Next we have

$$\sum_i p_i\dot q_i - L = (m\dot r)\dot r + (mr^2\dot\theta)\dot\theta - L = m\left(\dot r^2 + r^2\dot\theta^2\right) - \frac{m}{2}\left(\dot r^2 + r^2\dot\theta^2\right) + V(r) = \frac{m}{2}\left(\dot r^2 + r^2\dot\theta^2\right) + V(r) = \text{total energy}.$$

Solving the equations for $\dot r$ and $\dot\theta$ in terms of $p_r$ and $p_\theta$, we have $\dot r = p_r/m$, $\dot\theta = p_\theta/mr^2$, and substituting these in the Hamiltonian, we have

$$H = \frac{1}{2m}\left(p_r^2 + \frac{p_\theta^2}{r^2}\right) + V(r).$$

Then Hamilton's equations are

$$\frac{\partial H}{\partial p_r} = \frac{p_r}{m} = \frac{dr}{dt}, \qquad \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{mr^2} = \frac{d\theta}{dt},$$

both of which we already knew. Also

$$-\frac{\partial H}{\partial r} = \frac{p_\theta^2}{mr^3} - \frac{dV(r)}{dr} = \frac{dp_r}{dt},$$

showing that the time rate of change of radial momentum equals the external force $-dV/dr$ in the $r$ direction, plus the centrifugal force $p_\theta^2/mr^3$ (which evidently equals $mr\dot\theta^2 = m\omega^2 r = mv^2/r$). Finally

$$-\frac{\partial H}{\partial\theta} = 0 = \frac{dp_\theta}{dt},$$

showing that the time rate of change of angular momentum is zero, on account of the absence of torques.

51. Applications of Lagrange's and Hamilton's Equations. From our discussion one might get the impression that the only use of Lagrange's and Hamilton's equations was in introducing curvilinear coordinates in problems of the dynamics of a particle. This is, however, far from the case. For example, one may have a particle moving subject to certain constraints, as a bead sliding along a frictionless wire, or a particle constrained to move on the surface of a sphere or other surface, as the bob of a spherical pendulum must move in a sphere. Then we may often satisfy the conditions of constraint by suitable choice of the generalized coordinates.
Thus, with the spherical pendulum, we may take spherical polar coordinates $r$, $\theta$, $\phi$. We may then arbitrarily set $r$ constant, equal to $R$, the radius of the sphere, and write Lagrange's equations for $\theta$ and $\phi$. To justify this, we note that the component of the external and centrifugal force normal to the sphere will be exactly balanced by the reaction of the constraint, just as the weight of a body resting on a table is exactly balanced by the upward push of the table. Thus the generalized force acting in the direction of $r$ will be zero, so that a constant value for $R$, leading to a constant and vanishing generalized momentum along $r$, is a solution of the equations. For a particle on a wire, similarly, if the wire happened to be a circle, we could take polar coordinates, set $r$ constant, and have but one equation of motion, stating that the torque acting on the particle equaled the time rate of change of its angular momentum. We note that these two problems are essentially equivalent to the spherical and ordinary pendulum, which are rigid bodies, suggesting that Lagrange's equations are of use in discussing the motion of a rigid body.

But we can go even further. An Atwood's machine, for instance, is a special case of coupled systems, two weights being hung by a string over a pulley. This can be described very easily by a single generalized coordinate. In the general problem of coupled systems, and in fact in all problems of interaction of different particles or systems, Lagrange's method is very suitable, as we shall see. In fact, there is hardly a mechanical problem where generalized coordinates are not applied.

For the actual solution of problems, Hamilton's equations are generally not so convenient as Lagrange's equations. Their importance comes in the insight they give into the situation, by bringing the momenta directly into the statement of the equations, and for their relation to more advanced mechanics.
The applications are principally to three fields: celestial mechanics, statistical mechanics, and quantum theory. We shall indicate in the next chapter the nature of some of these applications of Hamiltonian methods, taking up some of the general properties of the motion of particles, but postponing until later in the book the discussion of statistics and of quantum mechanics.

Problems

1. An Atwood's machine is built as follows: A string of length $l_1$ passes over a light fixed pulley, supporting a mass $m_1$ on one end and a pulley of mass $m_2$ (negligible moment of inertia) on the other. Over this second pulley passes a string of length $l_2$ supporting a mass $m_3$ on one end and $m_4$ on the other, where $m_3 \neq m_4$. Set up Lagrange's equations of motion for this system, using two appropriate generalized coordinates. From these show that the mass $m_1$ remains in equilibrium if

$$m_1 = m_2 + m_3 + m_4 - \frac{(m_4 - m_3)^2}{m_3 + m_4}.$$

2. A particle slides on the inside of a smooth paraboloid of revolution whose axis is vertical. Use the distance from the axis, $r$, and the azimuth as generalized coordinates. Find the equations of motion. Find the angular momentum necessary for the particle to move in a horizontal circle. If this latter motion is disturbed slightly, show that the particle will perform small oscillations about this circular path, and find the period of these oscillations.

3. Set up the kinetic energy, Lagrange's equations, and Hamilton's equations in spherical polar coordinates. Set up expressions for the generalized forces acting on $r$, $\theta$, and $\phi$, and for the generalized momenta, explaining the physical meaning of these quantities.

4. Set up the problem of a spherical pendulum subject to gravity and to a resisting force proportional to the velocity and opposite in direction. Use spherical polar coordinates.
Show that for small amplitudes and no damping this problem reduces to the two-dimensional linear oscillator of Prob. 9, Chap. VII.

5. Derive the Hamiltonian equations for Prob. 4, in the general case, showing that the damping forces give extra terms in the equations proportional to the momenta. Show that these equations in general cannot be separated. Derive a solution, however, for the special case in which the instantaneous motion would be a rotation about the lowest point of the sphere if damping were absent. Assume small damping, so that the actual motion is a gradual spiralling in toward the lowest point.

6. The force on an electron of charge $e$, moving with a velocity $\mathbf{v}$ in a magnetic field $\mathbf{H}$, is given by $\mathbf{F} = \frac{e}{c}(\mathbf{v} \times \mathbf{H})$, where $c$ is the velocity of light. This corresponds to the ordinary motor law, in which the force on a circuit is proportional to the current (here $e\mathbf{v}/c$) and to the field, and at right angles to both. In addition, the magnetic field $\mathbf{H}$ can be given as the curl of a vector $\mathbf{A}$, called the vector potential. Show that the equations of motion of an electron moving in such a magnetic field, and in addition in a potential field of potential $V$, can be described by Lagrange's equations, with the Lagrangian function $L = T - V + (e/c)(\mathbf{v}\cdot\mathbf{A})$. Assume the vector potential, and magnetic field, to be independent of time, but note that

$$\frac{d\mathbf{A}}{dt} = \frac{\partial\mathbf{A}}{\partial t} + \frac{\partial\mathbf{A}}{\partial x}\frac{dx}{dt} + \frac{\partial\mathbf{A}}{\partial y}\frac{dy}{dt} + \frac{\partial\mathbf{A}}{\partial z}\frac{dz}{dt}, \quad\text{where } \frac{\partial\mathbf{A}}{\partial t} = 0.$$

7. For the particle of Prob. 6, set up the momentum and the Hamiltonian function. Show that the momenta do not equal mass times velocity, and the Hamiltonian does not have the form $p^2/2m + V$.

8. In the relativity theory, the equations of motion of a particle are different from what they are in classical mechanics, though they reduce to the same thing for small velocities. In particular, the mass of a particle increases with velocity.
If a particle has a mass $m_0$ when at rest, its mass at speed $v$ is given by

$$m = \frac{m_0}{\sqrt{1 - v^2/c^2}},$$

where $c$ is the velocity of light; reducing to $m_0$ in the limit $v/c = 0$, but becoming infinite when the particle moves with the speed of light. Show that the equations of motion are correctly given from the Lagrangian function $-m_0c^2\sqrt{1 - v^2/c^2} - V$, when we remember that the momentum equals the velocity times the (variable) mass. Derive the Hamiltonian function from the Lagrangian function. Setting the Hamiltonian function equal to $T + V$, where $T$ is the kinetic energy, show that the Lagrangian function is not equal to $T - V$, as is natural from the fact that the kinetic energy is not a homogeneous quadratic function of the velocities. Taking the kinetic energy, expand in power series in the quantity $v/c$, showing that for low speeds the kinetic energy approaches its ordinary classical value, except for an additive constant $m_0c^2$. This additive constant, which always appears in relativistic energy expressions, is interpreted as meaning that the mass of the particle is really equivalent to energy, 1 gm. being convertible into $c^2$ ergs of energy.

CHAPTER IX

PHASE SPACE AND THE GENERAL MOTION OF PARTICLES

As in one-dimensional motion, we can make a great deal of use of the energy in discussing motion in two and three dimensions. In a conservative system with potential energy $V$, the motion can occur only in those regions of space where $E - V$ is positive, if $E$ is the total energy, and we can thus divide up our possible problems into those occurring within a finite region and those going to infinity. As with one-dimensional motion, there are sometimes periodicity properties associated with the finite motions, which we discuss in the present chapter.
With two-dimensional motion, we can visualize the use of the energy very easily, plotting $V$ as a height in a three-dimensional graph, the result looking like a relief map, or else drawing equipotentials, which represent the potential as the contour lines represent height on a map. For a total energy $E$, we imagine the map filled with water up to a level $E$, so that the submerged parts, lakes and oceans, represent the regions where the motion occurs. We may also use the analogy of the rolling ball in two dimensions as well as in one, imagining that a ball starts rolling down the side of a hill in our relief map, climbing up the valley on the other side, and oscillating back and forth. From physical intuition as to the motion of such a ball, we can derive much information about complicated forms of motion.

There is one great complication present in motion in several dimensions which was absent in one-dimensional motion. In that simpler case, the velocity of a particle was determined at each point of space in a conservative motion, only the direction, forward or back, being arbitrary. Here, however, while the magnitude of the velocity is still determined, there are an infinite number of possible directions associated with the same magnitude. To describe a motion completely, then, even if we know its energy, we must give as well the velocity components, or else the momenta, at each point of the path. This is accomplished by describing the motion, not in ordinary space, but in the so-called phase space, in which there are dimensions associated with both the generalized coordinates and the generalized momenta. And the importance of Hamilton's equations arises from the fact that they are peculiarly suited to a discussion of motion by means of the phase space.

52. The Phase Space. For a system with $n$ degrees of freedom and $n$ generalized coordinates $q_1 \ldots q_n$, the phase space is a $2n$-dimensional space in which $q_1 \ldots q_n$ and $p_1 \ldots p_n$ are plotted as variables. A single point in this space, often called a representative point, then determines all coordinates and velocities of the system. As time goes on, the representative point moves about the space, as both coordinates and momenta change with time. It is here that we make connection with Hamilton's equations, for these equations, $dq_i/dt = \partial H/\partial p_i$, $dp_i/dt = -\partial H/\partial q_i$, give just the components of the velocity of the representative point in the phase space. The problem of dynamics is to investigate the path of the representative point in the phase space.

We can easily see some properties of this motion. In the first place, it takes place with constant energy, assuming that we are dealing only with conservative forces. To prove this, we have for the time rate of change of $H$

$$\begin{aligned}
\frac{dH}{dt} &= \frac{\partial H}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial H}{\partial q_2}\frac{dq_2}{dt} + \cdots + \frac{\partial H}{\partial p_1}\frac{dp_1}{dt} + \frac{\partial H}{\partial p_2}\frac{dp_2}{dt} + \cdots \\
&= \frac{\partial H}{\partial q_1}\frac{\partial H}{\partial p_1} + \frac{\partial H}{\partial q_2}\frac{\partial H}{\partial p_2} + \cdots + \frac{\partial H}{\partial p_1}\left(-\frac{\partial H}{\partial q_1}\right) + \frac{\partial H}{\partial p_2}\left(-\frac{\partial H}{\partial q_2}\right) + \cdots = 0.
\end{aligned}$$

Now the energy $H$ is a function of the coordinates and momenta, and hence a function of position in the phase space. Thus the equation $H = \text{constant}$ determines a single relation between all the $p$'s and $q$'s, and hence is the equation of a $(2n - 1)$-dimensional hypersurface in the $2n$-dimensional space. The representative point now moves about, but always stays on a single energy surface. If in addition there are other quantities which stay constant, as, for instance, an angular momentum, each one of these quantities gives an additional equation between the $p$'s and $q$'s, so that the representative point can move only in the intersection of all the various surfaces represented by these equations. Thus in some cases the region in which the motion occurs is of smaller dimensionality than $2n - 1$.
The extreme case is purely periodic motion; in that case there are enough quantities staying constant so that the motion of the representative point is in a single closed line in phase space, or a one-dimensional region. This fits in with the fact that all one-dimensional conservative motions not extending to infinity are periodic: for these have $n = 1$, $2n - 1 = 1$, so that the energy "surface" itself reduces to a line. Motions are possible in all the intermediate cases between the periodic motion, and the other extreme in which the representative point comes eventually arbitrarily close to every point of the energy surface. The latter type of motion is called quasi-ergodic (ergodic motion being a nonexistent type in which the point passes through every point of the surface). Some of the intermediate types are multiply periodic motions, like our doubly periodic motion in the central field. We shall investigate some of these typical cases by means of examples.

Fig. 11. Phase space for a linear oscillator, with line of constant energy $E$.

53. Phase Space for the Linear Oscillator. As an illustration of one-dimensional motion, we may take a linear oscillator (see Fig. 11). The phase space is two-dimensional, so that the energy surface is really a line. If the energy is $\frac{1}{2}mv^2 + 2\pi^2 m\nu^2x^2$, where we readily see that $\nu$ is the frequency of oscillation, the Hamiltonian function is $p^2/2m + 2\pi^2 m\nu^2x^2$. Setting this equal to a constant, $E$, the equation of the line of constant energy is $p^2/2m + 2\pi^2 m\nu^2x^2 = E$, or

$$\frac{p^2}{\left(\sqrt{2mE}\right)^2} + \frac{x^2}{\left(\sqrt{E/2\pi^2 m\nu^2}\right)^2} = 1, \tag{1}$$

the equation of an ellipse, having semi-axes $\sqrt{2mE}$ and $\sqrt{E/2\pi^2 m\nu^2}$.

54. Phase Space for Central Motion. As an illustration of a two-dimensional problem, we may take a central motion. The phase space is four-dimensional, so that we cannot directly plot it: the axes represent $r$, $\theta$, $p_r$, $p_\theta$.
But we recall that in central motion $p_\theta$ stays constant, so that we may choose a particular value of $p_\theta$, and use a three-dimensional section of the phase space, the axes representing $r$, $\theta$, and $p_r$. We imagine $r$ and $\theta$ as rectangular coordinates in a plane, and $p_r$ as a coordinate at right angles to the plane. Now the energy surface is given by

$$\frac{p_r^2}{2m} + \frac{p_\theta^2}{2mr^2} + V(r) = E = \text{constant}, \tag{2}$$

or solving for $p_r$, $p_r = \pm\sqrt{2m\left(E - V - p_\theta^2/2mr^2\right)}$. That is, for each value of $r$ and $\theta$ ($E$ and $p_\theta$ being fixed), two values of $p_r$ are given by the equation. If we plot these values, we get the surface of Fig. 12, on which the representative point moves in a spiral around the cylindrical surface, $\theta$ continually increasing, while $r$ increases and decreases as the point spirals round and round. Although the orbit criss-crosses on itself, as we saw in Fig. 10, Chap. VII, still in the phase space the two different possible directions of motion at a given point of space are on opposite sides of the energy surface.

Fig. 12. Surface of constant energy and constant angular momentum in phase space of a particle moving in a central field. The spiral represents the path of a particle.

In Fig. 12, we have plotted only the part of the energy surface between $\theta = 0$ and $\theta = 2\pi$. The spiral, however, continues indefinitely. Since the regions from $\theta = 2\pi$ to $4\pi$, $4\pi$ to $6\pi$, etc., all represent the same regions of space as $0$ to $2\pi$, it is reasonable to telescope these sections of the surface on to the one we have drawn. Each one will have its own segment of spiral, so that we shall have an infinite number of pieces all drawn on the surface shown in Fig. 12. There are now two possibilities. First, the motion can be periodic, as mentioned in Chap. VII. Then infinitely many segments of the spiral will lie on top of each other, resulting in one or a finite number of segments only. This is then a one-dimensional line in phase space, as we expect for periodic motion.
Or second, the motion can be doubly periodic, the general case for this problem. In that case, the infinite number of segments of the spiral will not coincide, and instead they will fill the whole surface densely. In other words, in this case the path of the representative point fills a two-dimensional region. This is characteristic of doubly periodic motions.

55. Noncentral Two-dimensional Motion. Let us consider motion in a field only slightly different from a central one, as, for instance, if we had a central field and a small external field of some other sort superposed. Then there will be slight torques acting on the particle, so that its angular momentum will change slowly. Now a given angular momentum corresponds to a given surface in Fig. 12. Hence in this motion the representative point does not confine itself to the surface we have drawn, but moves also on larger and smaller surfaces. If the motion without the additional torques were doubly periodic, and if no new regularities were introduced, the path of the representative point would now fill densely a whole set of surfaces with continuously varying sizes; that is, it would fill up a three-dimensional volume, the most general thing possible. In most cases this volume would be the whole region consistent with the energy, so that the motion would be quasi-ergodic. The motion itself, in two-dimensional coordinate space, would resemble for a short time the orbit of Fig. 10, Chap. VII, but the circles to which the orbit is tangent would slowly increase or decrease in size, the loops of the orbit simultaneously getting less or more rounded. If the departure from a central field were large, we could not use this approximate description, but should have to say simply that successive loops of the orbit were not merely oriented differently, but were of different size and shape.

56. Configuration Space and Momentum Space.
It is not always so easy to reduce a four-dimensional phase space to three dimensions as it was with the central field. We can always visualize the phase space, however, by imagining a separate $n$-dimensional momentum space associated with each point of the $n$-dimensional coordinate or configuration space. If we assume that the $n$ coordinates are the rectangular or Cartesian coordinates, then we have a simple interpretation of the condition that the representative point move on an energy surface. For this states that kinetic energy $= E - V$, or $p_x^2 + p_y^2 + p_z^2 = 2m(E - V)$. But $p_x^2 + p_y^2 + p_z^2$ is simply the square of the radius in momentum space, so that to a given energy and a given point of space corresponds a sphere (or, with two dimensions, a circle) in the momentum space, on the surface of which the representative point must move. In quasi-ergodic motion, the representative point at one time or another comes arbitrarily close to each point of the surface of these spheres. We note that the spheres exist, and have real radii, only in that part of configuration space where $E - V$ is positive, and where, therefore, according to the energy principle, the motion can occur.

But now in the more specialized types of motion, all points of the surface of the spheres are not available for representative points. Thus, in central motion, where the sphere degenerates to a circle in the two-dimensional momentum space, only those velocities are allowed which correspond to a given angular momentum. That is, $p_x$ and $p_y$ must satisfy at the same time the equations

$$p_x^2 + p_y^2 = 2m(E - V), \qquad x p_y - y p_x = \text{angular momentum} = \text{constant},$$

the equations of a circle and a straight line, respectively, in momentum space.
These intersect in two points or in no points, so that for some parts of the configuration space corresponding to positive kinetic energy there are two possible values of the momentum, and for other parts there is none, and the motion cannot occur. These excluded regions are those within the small circle in Fig. 10, Chap. VII, and outside the large circle, but within the circle on which the kinetic energy becomes zero.

57. The Two-dimensional Oscillator. A second example of two-dimensional motion is provided by the two-dimensional oscillator. In Chap. VII, Probs. 7, 8, and 9, it was shown that the motion of a particle attracted to a center by the forces $F_x = -ax$, $F_y = -ay$ could be solved by separation of variables, each coordinate vibrating like a separate oscillator, and the combined motions producing an elliptical orbit with the center at the center of attraction. The motion is periodic, with the same period which the corresponding one-dimensional motion would have. To obtain a nonperiodic motion, we make $F_x = -cx$, $F_y = -ky$, the force components being proportional to the displacements, but with different coefficients. It is easily seen that in this case the total force, regarded as a vector, is not in the same direction as the displacement. An example is found in the vibration of a rectangular stick, if one end is clamped and the other vibrates, since the stick is stiffer for bending in one direction than the other, unless it is square.

Fig. 13. Lissajous figure for the orbit of a two-dimensional oscillator. The ellipse surrounding the rectangle represents the equipotential corresponding to the energy of the motion.

The variables are still separated in the equations of motion, and the solution is

$$x = A_1\cos\left(\sqrt{c/m}\,t - \alpha_1\right), \qquad y = A_2\cos\left(\sqrt{k/m}\,t - \alpha_2\right).$$
(3) The motion is in general no longer periodic, for after one period of the $x$ motion, the $y$ motion will not have traversed just a full period, but will be in a different phase. By plotting (see Fig. 13), one can see that the orbit is always within the rectangle bounded by $x = \pm A_1$, $y = \pm A_2$, that it is often tangent to the edges of this rectangle, and that in time it comes arbitrarily close to any point within the rectangle. The resulting figure is called a Lissajous figure, and this sort of motion is typical of many examples which one meets. The orbit in central motion is, in fact, a sort of Lissajous figure, as Fig. 10, Chap. VII, shows.

The two-dimensional oscillator is a typical doubly periodic motion, the periods being just those of the two separate degrees of freedom. The displacements $x$ and $y$ are singly periodic, but if we wished to express, for example, the displacement in an arbitrary direction as a function of time, we should have $ax + by$, which would be a sum of two terms, one periodic with the one frequency, the other with the other. Inspection of Fig. 13 shows that at a given point of space, there are just two branches of the orbit, corresponding to two definite values of the momentum, rather than having all momenta consistent with the given kinetic energy, in all directions, permitted as in quasi-ergodic motion.

A small perturbation applied to the two-dimensional oscillator would destroy the double periodicity, and make the motion quasi-ergodic. Thus we might have a small central field added to the linear restoring force. If the perturbation were small, we might apply what is called the method of variation of constants. That is, we could consider the coordinates to be given by Eq. (3), but regard the $A$'s and $\alpha$'s as slowly varying functions of time rather than constants.
Substituting these expressions in the differential equations, we should find that the perturbation produced changes of amplitude and phase, at rates proportional to the magnitude of the perturbation. Considered from the standpoint of Fig. 13, this means that the rectangle is gradually changing its dimensions, subject always, however, to the condition that it is at least approximately inscribed in the same ellipse, since the total energy is only slightly changed by the perturbation. The result then is a slowly changing Lissajous figure, looking therefore like a superposition of many such figures, filling up the ellipse, and giving at a point of space not two possible momenta only, but a continuous range of momenta, in all directions, leading, therefore, to quasi-ergodic motion.

A similar discussion can be given for the simpler problem of the almost periodic oscillator. In the exactly periodic case, the orbit is a single ellipse inscribed in a rectangle like that of Fig. 13, which in turn is inscribed in an ellipse. If the problem is made slightly different, by introducing only a very small difference between the force constants in the two directions, the dimensions of the ellipse can be considered to change slowly, though it always remains inscribed in the rectangle. The actual Lissajous figure, as one sees at once by inspection, is very similar to what one would obtain by drawing a great many ellipses, all inscribed in the same rectangle.

58. Methods of Solution. We have seen one method of solving mechanical problems in several dimensions, that of separation of variables, by which the problem is reduced essentially to several independent one-dimensional problems. There are several problems which can be solved by this method, in addition to the oscillator and the central field problems which we have treated.
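As an illustration of the separable oscillator just mentioned, the following minimal Python sketch samples the orbit of Eq. (3) and counts what fraction of the cells of a grid, laid over the rectangle $x = \pm A_1$, $y = \pm A_2$, the orbit passes through. The function name `visited_fraction`, the grid size, the time span, and the numerical values of $c$, $k$, and $m$ are our own assumptions, chosen only for illustration and not taken from the text. With an irrational frequency ratio the fraction should approach unity, while a simple rational ratio should give a closed curve touching only a few cells.

```python
import math

def visited_fraction(c, k, m=1.0, A1=1.0, A2=1.0, alpha1=0.0, alpha2=0.0,
                     t_max=2000.0, n_steps=200_000, grid=20):
    """Fraction of grid cells in the rectangle |x| <= A1, |y| <= A2 visited by
    x = A1 cos(sqrt(c/m) t - alpha1), y = A2 cos(sqrt(k/m) t - alpha2)."""
    wx = math.sqrt(c / m)          # angular frequency of the x motion
    wy = math.sqrt(k / m)          # angular frequency of the y motion
    seen = set()
    for i in range(n_steps + 1):
        t = t_max * i / n_steps
        x = A1 * math.cos(wx * t - alpha1)
        y = A2 * math.cos(wy * t - alpha2)
        # Map the point to a cell index, clamping the upper edge into the last cell.
        ix = min(grid - 1, int((x + A1) / (2.0 * A1) * grid))
        iy = min(grid - 1, int((y + A2) / (2.0 * A2) * grid))
        seen.add((ix, iy))
    return len(seen) / grid ** 2

# Illustrative (assumed) force constants: frequency ratio sqrt(2) and ratio 2.
frac_irrational = visited_fraction(c=1.0, k=2.0)
frac_rational = visited_fraction(c=1.0, k=4.0)
```

Comparing `frac_irrational` with `frac_rational` gives a rough numerical counterpart of the statement that the doubly periodic orbit comes arbitrarily close to every point of the rectangle. The grid count is only a coarse indicator and depends on the assumed resolution and time span.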
The problem of a particle in the field of two attracting centers, both attracting according to the inverse square law, can be solved by separation in ellipsoidal coordinates, with the two centers as foci. In three dimensions, the central or the axially symmetrical fields can be solved by separation. The solutions in all these cases are multiply periodic, as we can see at once from the fact that each coordinate, acting like a one-dimensional problem, must be singly periodic. It is thus obvious that no problems except multiply periodic ones can be solved by separation, and it seems likely that the small list we have just given includes practically all the multiply periodic mechanical problems which exist in two or three dimensions.

59. Contact Transformations and Angle Variables. Hamilton's equations can be applied to multiply periodic motions by making certain transformations of coordinates which are called contact transformations, because it can be shown that they transform two curves which are in contact with each other in the original space into curves in contact in the new space. An ordinary transformation of coordinates, of the sort which we have discussed in connection with Lagrange's equations, is a transformation in which new coordinates are written as functions of the old ones: $q_i' = q_i'(q_1 \cdots q_n)$, if the $q$'s are the old coordinates, the $q'$'s the new ones. The new momenta, derived from the new Lagrangian function, are then functions of the old coordinates and momenta: $p_i' = p_i'(q_1 \cdots q_n, p_1 \cdots p_n)$. Such a transformation is called a point transformation. But in a contact transformation, the new coordinates as well as the new momenta are functions of both the old coordinates and momenta:

$$q_i' = q_i'(q_1 \cdots q_n, p_1 \cdots p_n), \qquad p_i' = p_i'(q_1 \cdots q_n, p_1 \cdots p_n).$$
(4) There must naturally be restrictions on the functions, just as in ordinary point transformations we require that the new momenta be derived from the new Lagrangian function. When these restrictions are applied, however, it proves that Hamilton's equations are still satisfied in the new coordinates, though Lagrange's are not. Such contact transformations can often be very useful in complicated problems, reducing them to forms which can be handled mathematically. A contact transformation can be most easily visualized simply as a change of variables in the phase space. For instance, suppose we have the phase space for a linear oscillator, as in Fig. 11. We can easily choose the scale so that the line of constant energy is a circle, rather than an ellipse. Then it is often useful to introduce polar coordinates in the phase space, so that the motion is represented by a constant value of $r$, and a value of $\theta$ increasing uniformly with time. The angle $\theta$, or rather $\theta/2\pi$, in this case, is often called the angle variable, and is used as the coordinate. This is from analogy with the rotation of a body acted on by no torques, where the angular momentum stays constant, and the angle increases linearly with the time. The momentum conjugate to the angle variable, which stays constant with time, is not simply the radius, as we should expect from the simple use of polar coordinates, but proves to be proportional to the square of $r$; in fact, it is just $\pi r^2$, or the area of the circle. This momentum is called the action variable, or phase integral, denoted by $J$, and the angle variable is denoted by $w$.
Since Hamilton's equations hold in the transformed coordinates, and since evidently the energy $H$ depends only on $J$, being independent of $w$, Hamilton's equations become

$$-\frac{\partial H}{\partial w} = 0 = \frac{dJ}{dt}, \qquad (5)$$

verifying the fact that $J$ is a constant of the motion; and

$$\frac{\partial H}{\partial J} = \frac{dw}{dt}, \qquad (6)$$

a quantity independent of time, and of $w$, verifying the fact that $w$ increases uniformly with time. Now since $w = \theta/2\pi$, it increases by unity in one period, so that $dw/dt$ is just $1/T$, where $T$ is the period, or is $\nu$, the frequency of motion. Hence we have the important relation that

$$\nu = \frac{\partial H}{\partial J}, \qquad (7)$$

giving the frequency of motion in terms of the derivative of the energy with respect to the action variable $J$.

It can be shown in a similar way that action and angle variables can be introduced in general in one-dimensional periodic motions. In every case the $w$'s increase uniformly with time, the frequency being given by Eq. (7). It also proves to be true in general that the action variable $J$ is given by the area of the path of the representative point in phase space, which is the reason why it is called a phase integral. This area can be written $\int p\,dq$, where this is analogous to $\int y\,dx$, the area under the curve $y(x)$. In Fig. 11, for instance, we integrate from the minimum to the maximum $q$ along the upper branch of the ellipse, obtaining the part of the area above the $q$ axis; then integrate back along the lower branch, where both $p$ and $dq$ are negative, obtaining the area below the $q$ axis, so that the complete integral about the whole curve, which may be written $\oint p\,dq$, gives the whole area, or $J$. Connected with this is the criterion which a transformation of the $p$'s and $q$'s must satisfy if it is to be a contact transformation: it can be proved that it is a transformation in which areas in the phase space are preserved, or are not affected by the transformation, though the shape of an area in the new coordinates may be very different from what it was in the old.
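The phase integral and the frequency relation of Eq. (7) can be checked numerically for the linear oscillator. The sketch below is only an illustration under our own assumptions: the function names `action` and `frequency_from_action`, the sample values of mass, frequency, and energy, and the substitution $q = q_{max}\sin u$ used to carry out the integral are not taken from the text. It evaluates the area enclosed by the path in the $q$, $p$ plane and then forms the ratio of a small energy difference to the corresponding difference of $J$.

```python
import math

def action(E, m, nu, n=2000):
    """Phase integral J (area enclosed in the q, p plane) for a linear
    oscillator of mass m and frequency nu at energy E, by the midpoint rule."""
    k = m * (2.0 * math.pi * nu) ** 2        # force constant for frequency nu
    q_max = math.sqrt(2.0 * E / k)           # turning point where p = 0
    du = math.pi / n
    upper = 0.0
    for i in range(n):
        u = -0.5 * math.pi + (i + 0.5) * du  # substitution q = q_max sin u
        q = q_max * math.sin(u)
        p = math.sqrt(max(0.0, 2.0 * m * (E - 0.5 * k * q * q)))
        upper += p * q_max * math.cos(u) * du   # p dq along the upper branch
    return 2.0 * upper                        # the lower branch contributes equally

def frequency_from_action(E, m, nu, dE=1e-3):
    """Central-difference estimate of dE/dJ, to be compared with nu."""
    return 2.0 * dE / (action(E + dE, m, nu) - action(E - dE, m, nu))

# Illustrative (assumed) values in arbitrary consistent units.
m_example, nu_example, E_example = 0.5, 3.0, 2.0
J_example = action(E_example, m_example, nu_example)
nu_estimate = frequency_from_action(E_example, m_example, nu_example)
```

For these values `J_example` should agree with $E/\nu$, the area of the ellipse, and `nu_estimate` should reproduce the assumed frequency, in line with Eq. (7).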
An immediate result of this preservation of areas is that the $J$'s are the same no matter what coordinates we may use for computing them.

Angle variables can also be introduced in cases with several degrees of freedom, provided the motion is multiply periodic, by using a separate angle variable for each coordinate. It is evident that the method could not be used with motions which were not multiply periodic, for we have seen that it is only in the multiply periodic motions that there are quantities, as, for example, angular momenta, which stay constant. Yet the action variables, or $J$'s, must stay constant, and consequently cannot be introduced, for example, in quasi-ergodic motions, where by hypothesis constants of the motion of this sort do not exist.

We shall meet angle variables and phase integrals again in Chap. XXX, where it is seen that they have close connection with the quantum theory. In that theory, the phase integrals prove to be quantized; that is, they take on only discrete values, $J$ being limited to integral multiples of a fundamental physical constant, Planck's constant $h$; and Eq. (7) for frequencies is replaced by an equation of finite differences, $\nu$ being a difference of energy in two energy levels, divided by the corresponding difference of $J$ (which can be simply $h$). These two formulas, which we elaborate later, form the basis of much of quantum theory.

60. Methods of Solution for Nonperiodic Motions. When we meet a problem whose solution is quasi-ergodic, we are facing a branch of mathematics which offers no explicit or exact solutions. The only solutions are in the form of various series methods, for instance by the method of perturbations, which can be used if the motion is almost multiply periodic. We indicated an example of this in discussing the two-dimensional oscillator, where we treated the problem as a Lissajous figure with slowly varying amplitudes and phases.
In general, the method of perturbations consists in developing the various quantities which appear in the problem in power series in the small quantities measuring the deviation from the multiply periodic case. If, for instance, that case has been discussed by the method of angle variables, we regard the $J$'s as slowly varying functions of time, their rate of variation being proportional, to the first order, to the magnitude of the perturbation. But in all these methods there is great difficulty in the matter of the convergence of the series; as time goes on, or as we consider larger and larger perturbations, they converge worse and worse, as is natural from the physical fact that often a slight change in initial conditions may, after the lapse of enough time, cause a profound change in the motion. These difficulties, as well as these methods of solution, are met particularly in celestial mechanics.

Problems

1. Given a linear oscillator of mass $m$, frequency $\nu$, displacement $x$, momentum $p$, we can introduce a new coordinate $w$ and momentum $J$, by the transformation

$$x = \sqrt{J/2\pi^2 m\nu}\,\cos 2\pi w, \qquad p = -\sqrt{2mJ\nu}\,\sin 2\pi w.$$

This change of variables can be shown to be a contact transformation. Find the Hamiltonian in terms of the new variables, by substituting these values of $x$ and $p$ in the total energy. Show that this resulting Hamiltonian depends on $J$ alone, being independent of $w$, and show that $w$ is an angle variable. Verify that $J$ is the phase integral, or area enclosed by the orbit in the phase space, and that $\nu = \partial H/\partial J$. Show the geometrical interpretation of the contact transformation in the phase space.

2. An electron of charge $-e$, mass $m$, moves about a nucleus of charge $Ze$, and very large mass. The potential energy is $-Ze^2/r$. Assuming the energy to be $E$, angular momentum $p_\theta$, separate variables, and consider the radial motion as a one-dimensional problem, as in Chap. VII.
Take a two-dimensional phase space in which $r$ and $p_r$ are variables, and plot the path of the representative point in this space.

3. Find the area of the path of the representative point in Prob. 2, and show that it is $\sqrt{2\pi^2 m Z^2 e^4/(-E)} - 2\pi p_\theta$. Set this equal to $J_r$, the action variable connected with the radial motion. Find the energy in terms of $J_r$, and by differentiation find the frequency of motion. Verify this result in the special case of circular motion, where you can compute the rotational frequency by elementary methods.

4. If $F_x = -cx$, $F_y = -ky$, prove by direct calculation that the force, regarded as a vector, is at right angles to the equipotential. Show that the force is not in the direction of the displacement.

5. Suppose in a two-dimensional oscillator that the force constants along the two axes are only slightly different from each other. Prove that the orbit resembles an ellipse, of slowly changing shape and size. (Hint: show that $x = A\cos(\omega t - \alpha)$, $y = B\cos(\omega t - \beta)$, where $A$, $B$, $\alpha$, and $\beta$ are constants, is the equation of an ellipse. Then show that the equation of the path of the oscillator can be written in this form, if $\alpha$ and $\beta$ are slowly changing functions of time.)

6. A particle moves as if it were executing simple harmonic motion about the center of a turntable, and at the same time the turntable were rotating with uniform angular velocity. Compute the $x$ coordinate of the particle as a function of time, and show that the motion is doubly periodic.

7. Sketch the orbits in Prob. 6, for several different ratios between the frequencies of oscillation and rotation, including some cases of irrational ratios, and also simple rational ratios, as 1/1, 1/2, 2/1.

8. A particle moving in two dimensions is attracted by two centers, of the same strength, attracting with a force proportional to the inverse square of the distance.
Compute and plot a number of equipotentials, showing that for some energies the motion must be entirely confined to the region around one or the other center, while for larger energies it can surround both centers.

9. A particle moves in three dimensions under the action of a force of attraction to a center, depending only on the distance. Set up the problem in spherical coordinates, using the results of Probs. 3 and 4, Chap. VIII. Show that the variables can be separated, so that the problem is multiply periodic. Show that energy, total angular momentum, and the component of angular momentum along the axis of coordinates, all remain constant, showing the connection of these quantities with the generalized momenta of the problem. Using the obvious fact that the motion occurs in a plane and is just like two-dimensional central motion in that plane, show that the periods of the motions in $\theta$ and $\phi$ are the same, so that the motion is only doubly, not triply, periodic.

CHAPTER X

THE MOTION OF RIGID BODIES

In the preceding chapters we have been treating the mechanics of particles. Then we have passed on to the general methods of Lagrange and Hamilton, which can be applied to all sorts of mechanical problems. The present chapter will take up the motion of rigid bodies. In elementary work, one learns the main outlines of the problem of the motion of a rigid body. We know that its motion is a superposition of a translation and a rotation. There are two fundamental laws of motion: the force equals the time rate of change of linear momentum, and the torque equals the time rate of change of angular momentum. To make our ideas more precise, the translational motion generally refers to the motion of the center of gravity, and the rotational to rotation about the center of gravity. The motion of the center of gravity is essentially like the motion of a particle, which we have already treated.
In order to leave that out in the present chapter, we shall assume that no net forces act, or that the body is pivoted, rotating about a fixed point.

61. Elementary Theory of Precessing Top. A torque is a vector, equal in magnitude to the force acting times its lever arm (that is, the perpendicular distance from the center of rotation to the line of action of the force), and at right angles to force and lever arm. That is, in vector notation, the torque on a single particle is $(\mathbf{r} \times \mathbf{F})$, where $\mathbf{r}$ is the radius vector to the particle, $\mathbf{F}$ the force acting, and the torque on the whole body is the vector sum of the separate torques on its parts. Similarly the angular momentum is a vector, defined in an analogous way: the angular momentum of a particle is equal in magnitude to the momentum times its lever arm, and at right angles to both, so that it is $[\mathbf{r} \times (m\mathbf{v})]$, or $m(\mathbf{r} \times \mathbf{v})$, and the total angular momentum of the body is the vector sum of the angular momenta of its parts. We see then that the equation "torque equals time rate of change of angular momentum" is a vector equation. This results in having two separate sorts of effect which a torque can produce. For we can analyze the torque into two components, one parallel to the angular momentum, the other at right angles. The first component of torque produces an increase or decrease of angular momentum in the same direction as the angular momentum already existing; that is, it produces a speeding up or slowing down of the rotation, or an ordinary angular acceleration. This is the effect seen in the speeding up or slowing down of wheels on fixed axles. The component of torque at right angles to the angular momentum, on the other hand, produces a rotation or precession of the angular momentum vector, without change of length, and hence a change in the axis of rotation.
This is the effect considered in the simple theory of the symmetrical top: if $\mathbf{p}$ represents the angular momentum of the top at a given instant (see Fig. 14), which is in approximately the same direction as the axis of figure of the top, the torque of gravity on the top will be $mgl\sin\theta$ in magnitude, where $l$ is the distance from the point of support to the center of gravity. The torque will act at right angles to the axis and $\mathbf{p}$, so that the change of momentum in time $dt$ will be $d\mathbf{p}$, as shown. Thus the angular momentum after time $dt$ will be the vector $\mathbf{p} + d\mathbf{p}$, obtained from the old vector by a precession, as if the whole figure were rotated about the vertical axis through the angle $d\phi$.

Fig. 14. Angular momentum vectors for precessing top. The increment of angular momentum $d\mathbf{p}$, proportional to and in the same direction as the torque of gravity, changes the total angular momentum from $\mathbf{p}$ to $\mathbf{p} + d\mathbf{p}$, that is, a precession through the angle $d\phi$.

We can easily find the rate of precession. For $d\phi$ evidently equals $dp$ divided by the radius of the circle, or is $dp/(|p|\sin\theta)$. On the other hand, $dp = mgl\sin\theta\,dt$. Hence

$$\frac{mgl\sin\theta\,dt}{|p|\sin\theta} = d\phi, \qquad \text{or} \qquad \frac{d\phi}{dt} = \frac{mgl}{|p|},$$

a precession increasing with increasing torque, but decreasing with increasing angular momentum. We note that if we regard the precessional velocity as a vector, say $\boldsymbol{\omega}$, along the vertical direction, and having a magnitude $mgl/|p|$, we have

$$\frac{d\mathbf{p}}{dt} = (\boldsymbol{\omega} \times \mathbf{p}). \qquad (1)$$

This is a general relation for a precessing vector, as we readily see.

The elementary ideas of torque and angular momentum do not permit us to go much farther than we have indicated here, without further analysis. With a body in the absence of torques, for instance, we know at once that the angular momentum stays constant, both in direction and in magnitude. But this tells us little about the actual complicated motion. We must then examine the problem more in detail.
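The precession relation of Eq. (1) lends itself to a quick numerical check. The following Python sketch integrates $d\mathbf{p}/dt = \boldsymbol{\omega} \times \mathbf{p}$ with a standard fourth-order Runge-Kutta step, taking $\boldsymbol{\omega}$ vertical with magnitude $mgl/|p|$. The helper names `cross` and `precess`, and the numerical values of $m$, $g$, $l$, $|p|$, and $\theta$, are our own illustrative assumptions, not values from the text.

```python
import math

def cross(a, b):
    """Vector product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def precess(p0, omega, t_end, n_steps=2000):
    """Integrate dp/dt = omega x p from t = 0 to t_end (Runge-Kutta, order 4)."""
    h = t_end / n_steps
    p = tuple(p0)
    for _ in range(n_steps):
        k1 = cross(omega, p)
        k2 = cross(omega, tuple(p[i] + 0.5 * h * k1[i] for i in range(3)))
        k3 = cross(omega, tuple(p[i] + 0.5 * h * k2[i] for i in range(3)))
        k4 = cross(omega, tuple(p[i] + h * k3[i] for i in range(3)))
        p = tuple(p[i] + h * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]) / 6.0
                  for i in range(3))
    return p

# Assumed sample values (SI units): a small top spinning fast enough that its
# angular momentum lies nearly along the axis of figure.
m, g, l = 0.1, 9.81, 0.03
p_mag = 0.05
theta = math.radians(30.0)

rate = m * g * l / p_mag                 # precession rate mgl/|p|
omega = (0.0, 0.0, rate)                 # precessional velocity, vertical
p0 = (p_mag * math.sin(theta), 0.0, p_mag * math.cos(theta))
period = 2.0 * math.pi / rate

p_quarter = precess(p0, omega, period / 4.0)   # after a quarter turn
p_full = precess(p0, omega, period)            # after one complete turn
```

After a quarter of the precession period the horizontal part of `p_quarter` should have swung through a right angle, and after a full period `p_full` should coincide with `p0`, the length of the vector being unchanged throughout.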
In the succeeding sections we consider the angular momentum, kinetic energy, etc., of solid bodies of arbitrary shape, with arbitrary axes of rotation, though we always assume that they rotate about a fixed point, as the center of gravity.

62. Angular Momentum, Moment of Inertia, and Kinetic Energy. Let a body rotate about the origin as a center, the axis of rotation having direction cosines $\lambda$, $\mu$, $\nu$ with the three axes. We may regard the angular velocity as a vector, whose direction is the axis of rotation, and whose magnitude is the magnitude of angular velocity. Thus, if the vector is $\boldsymbol{\omega}$, its magnitude $\omega_0$, we have $\omega_x = \lambda\omega_0$, $\omega_y = \mu\omega_0$, $\omega_z = \nu\omega_0$. Now we can easily find the linear velocity of any point of the body. This is numerically $\rho\omega_0$, where $\rho$ is the perpendicular distance from the point to the axis of rotation, and is at right angles to the axis of rotation and the perpendicular distance. In other words, the velocity is given by the vector product $(\boldsymbol{\omega} \times \mathbf{r})$; a little consideration shows that the vector product has the right direction. Now that we know the velocity of each point, we can compute the angular momentum. We have already seen that this is the sum of terms $m(\mathbf{r} \times \mathbf{v})$ for all particles of the body. But $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$, so that angular momentum $= \sum m[\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r})]$. This can be easily expanded. Thus the $x$ component, for example, is

$$m[y(\boldsymbol{\omega} \times \mathbf{r})_z - z(\boldsymbol{\omega} \times \mathbf{r})_y] = m[y(\omega_x y - \omega_y x) - z(\omega_z x - \omega_x z)] = m\omega_x(y^2 + z^2) - m\omega_y xy - m\omega_z xz,$$

with corresponding formulas for the other components. If now we sum over all particles of the body, remembering that $\boldsymbol{\omega}$ is the same for all, we have, if $p_x$, $p_y$, $p_z$ are the components of angular momentum,

$$p_x = A\omega_x - F\omega_y - E\omega_z,$$
$$p_y = -F\omega_x + B\omega_y - D\omega_z, \qquad (2)$$
$$p_z = -E\omega_x - D\omega_y + C\omega_z,$$

where for abbreviation we set

$$A = \sum m(y^2 + z^2), \quad B = \sum m(z^2 + x^2), \quad C = \sum m(x^2 + y^2),$$
$$D = \sum myz, \quad E = \sum mzx, \quad F = \sum mxy.$$
The quantities $A$, $B$, and $C$ are called the moments of inertia, and $D$, $E$, $F$ are the products of inertia; the first three are obviously the moments of inertia of the body in the ordinary sense, about the $x$, $y$, and $z$ axes, respectively. We note one thing at the outset: the angular momentum vector is not in general parallel to the angular velocity vector. Thus if $\omega_y = \omega_z = 0$, so that the angular velocity is along the $x$ axis, we have all three components of $\mathbf{p}$ in general different from zero.

Next we find the kinetic energy. For a single particle, this is $\frac{1}{2}mv^2$, or $\frac{1}{2}m(\boldsymbol{\omega} \times \mathbf{r})^2$. Again expanding, this is

$$\tfrac{1}{2}m[(\omega_y z - \omega_z y)^2 + (\omega_z x - \omega_x z)^2 + (\omega_x y - \omega_y x)^2]$$
$$= \tfrac{1}{2}m[\omega_x^2(y^2 + z^2) + \omega_y^2(z^2 + x^2) + \omega_z^2(x^2 + y^2) - 2\omega_x\omega_y xy - 2\omega_y\omega_z yz - 2\omega_z\omega_x zx].$$

Summing over all particles, and using the abbreviations above, this is

$$T = \tfrac{1}{2}(A\omega_x^2 + B\omega_y^2 + C\omega_z^2 - 2D\omega_y\omega_z - 2E\omega_z\omega_x - 2F\omega_x\omega_y). \qquad (3)$$

The quantity $T$ can be written as $\frac{1}{2}I\omega_0^2$, where

$$I = \lambda^2 A + \mu^2 B + \nu^2 C - 2\mu\nu D - 2\nu\lambda E - 2\lambda\mu F. \qquad (4)$$

It is easily shown that $I$ is simply $\sum m\rho^2$, where $\rho$, as before, is the perpendicular distance from the point to the axis of rotation, so that $I$ agrees with the elementary definition. If we imagine $\lambda$, $\mu$, and $\nu$ varied in any manner, the quantities $A$, $B$, . . . $F$ do not change. As a variation in $\lambda$, $\mu$, $\nu$ means a variation of direction of the rotation axis through the center of rotation $O$, we see that the sums $A$ . . . $F$ completely determine the moment of inertia of the body about any axis through the same center of rotation.

63. The Ellipsoid of Inertia; Principal Axes of Inertia. The Eq. (4) for the moment of inertia $I$ may be interpreted geometrically in a very simple manner. The equation $Ax^2 + By^2 + Cz^2 - 2Dyz - 2Ezx - 2Fxy = \text{constant}$ represents a surface of second degree.
If we denote by $r$ the radius vector drawn from $O$ to a point on this surface, having direction cosines $\lambda$, $\mu$, $\nu$, this equation becomes

$$r^2(A\lambda^2 + B\mu^2 + C\nu^2 - 2D\mu\nu - 2E\nu\lambda - 2F\lambda\mu) = \text{constant}. \qquad (5)$$

The expression inside the parentheses is just $I$, the moment of inertia, so that we have $r^2 = \text{constant}/I$, or $I = \text{constant}/r^2$. Now, since the moment of inertia is always positive and can never vanish, $r^2$ cannot become infinite and our surface is a closed surface. Since it is of second degree it is an ellipsoid with its center at $O$, and is called the ellipsoid of inertia at the point $O$. The ellipsoid of inertia has the simple physical significance that the moment of inertia of the body about any axis through $O$ is measured by the inverse square of the radius vector from $O$, drawn parallel to the rotation axis and terminating on the surface of the ellipsoid.

Every ellipsoid has three principal axes which are mutually orthogonal. These axes are known as the principal axes of inertia at $O$. Just as in the case of an ellipse, when coordinate axes are chosen coincident with the principal axes, the equation of the ellipsoid reduces to a sum of squares, so that the coefficients of the terms in $yz$, $zx$, and $xy$ disappear and we have $D = E = F = 0$. We shall often use coordinate axes coincident with the principal axes, but since these axes are fixed with respect to the rigid body, we must always remember that they are rotating axes in space, and we must describe their motion with respect to a system of axes fixed in space. Referred to the principal axes, the moment of inertia becomes simply $\lambda^2 A_0 + \mu^2 B_0 + \nu^2 C_0$, where these $A_0$, $B_0$, and $C_0$ are now computed with respect to axes fixed in the body, and so do not change with rotation of the body, as do the ordinary moments and products of inertia computed with respect to fixed axes.
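The relations of Eqs. (2), (3), and (4) can be verified on a small set of point masses. The Python sketch below is a minimal illustration under our own assumptions: the particle list, the axis direction, the angular speed, and all function names are invented for the example. It forms the sums $A$ . . . $F$, then compares the angular momentum of Eq. (2) with the direct sum of $m[\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r})]$, the moment of inertia of Eq. (4) with $\sum m\rho^2$, and the kinetic energy of Eq. (3) with $\frac{1}{2}I\omega_0^2$.

```python
def cross(a, b):
    """Vector product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def inertia_sums(particles):
    """Moments A, B, C and products D, E, F for a list of (m, (x, y, z))."""
    A = sum(m * (y * y + z * z) for m, (x, y, z) in particles)
    B = sum(m * (z * z + x * x) for m, (x, y, z) in particles)
    C = sum(m * (x * x + y * y) for m, (x, y, z) in particles)
    D = sum(m * y * z for m, (x, y, z) in particles)
    E = sum(m * z * x for m, (x, y, z) in particles)
    F = sum(m * x * y for m, (x, y, z) in particles)
    return A, B, C, D, E, F

def angular_momentum_eq2(sums, w):
    """Components p_x, p_y, p_z from Eq. (2)."""
    A, B, C, D, E, F = sums
    wx, wy, wz = w
    return (A * wx - F * wy - E * wz,
            -F * wx + B * wy - D * wz,
            -E * wx - D * wy + C * wz)

def angular_momentum_direct(particles, w):
    """Direct sum of m [r x (w x r)] over the particles."""
    total = [0.0, 0.0, 0.0]
    for m, r in particles:
        term = cross(r, cross(w, r))
        for i in range(3):
            total[i] += m * term[i]
    return tuple(total)

def kinetic_energy_eq3(sums, w):
    """Kinetic energy T from Eq. (3)."""
    A, B, C, D, E, F = sums
    wx, wy, wz = w
    return 0.5 * (A * wx * wx + B * wy * wy + C * wz * wz
                  - 2.0 * D * wy * wz - 2.0 * E * wz * wx - 2.0 * F * wx * wy)

def moment_eq4(sums, n):
    """Moment of inertia about the axis with direction cosines n, Eq. (4)."""
    A, B, C, D, E, F = sums
    lam, mu, nu = n
    return (lam * lam * A + mu * mu * B + nu * nu * C
            - 2.0 * mu * nu * D - 2.0 * nu * lam * E - 2.0 * lam * mu * F)

def moment_direct(particles, n):
    """Elementary definition: sum of m rho^2, rho the distance from the axis."""
    total = 0.0
    for m, (x, y, z) in particles:
        along = x * n[0] + y * n[1] + z * n[2]
        total += m * (x * x + y * y + z * z - along * along)
    return total

# Assumed sample body: four point masses, arbitrary units.
particles = [(1.0, (1.0, 0.5, -0.2)), (2.0, (-0.3, 1.2, 0.7)),
             (0.5, (0.4, -0.9, 1.1)), (1.5, (-1.0, -0.4, -0.6))]
n_axis = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)    # direction cosines, unit length
w0 = 4.0                                      # magnitude of angular velocity
w_vec = tuple(w0 * c for c in n_axis)
sums = inertia_sums(particles)
```

With these definitions, `angular_momentum_eq2(sums, w_vec)` and `angular_momentum_direct(particles, w_vec)` should agree component by component, and `kinetic_energy_eq3(sums, w_vec)` should equal `0.5 * moment_eq4(sums, n_axis) * w0**2`.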
Referred to the principal axes, the kinetic energy of rotation is then $T = \frac{1}{2}I\omega_0^2 = \frac{1}{2}(\lambda^2\omega_0^2 A_0 + \mu^2\omega_0^2 B_0 + \nu^2\omega_0^2 C_0)$, which is also $T = \frac{1}{2}(A_0\omega_1^2 + B_0\omega_2^2 + C_0\omega_3^2)$, where $\omega_1$, $\omega_2$, and $\omega_3$ are the components of $\boldsymbol{\omega}$ taken about the principal axes.

64. The Equations of Motion. Suppose the moment, or torque, of the external force is $\mathbf{M}$, with components $M_x$, $M_y$, $M_z$. Then the equations of motion are obtained by setting the torque equal to the time rate of change of angular momentum: $\mathbf{M} = d\mathbf{p}/dt$, or for the $x$ component

$$M_x = \frac{d}{dt}(A\omega_x - F\omega_y - E\omega_z), \qquad (6)$$

where, of course, we are using arbitrary $x$, $y$, $z$ coordinates, not the principal axes. In performing the differentiation, we must remember that not only are $\omega_x$, $\omega_y$, $\omega_z$ changing, but also $A$, $F$, $E$, since the body is rotating, and these moments and products of inertia are defined with respect to a particular fixed coordinate system. Thus we have

$$M_x = A\dot\omega_x - F\dot\omega_y - E\dot\omega_z + \dot A\omega_x - \dot F\omega_y - \dot E\omega_z.$$

The last three terms can be rewritten, using, for instance, $\dot A = \frac{d}{dt}\sum m(y^2 + z^2) = 2\sum m(y\dot y + z\dot z)$, along with $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$, so that $\dot x = \omega_y z - \omega_z y$, etc. Without trouble we find that the equations can be written

$$M_x = A\dot\omega_x - F\dot\omega_y - E\dot\omega_z - (B - C)\omega_y\omega_z - D(\omega_y^2 - \omega_z^2) + F\omega_x\omega_z - E\omega_x\omega_y, \qquad (7)$$

with equivalent equations for the $y$ and $z$ components. The latter terms seem very complicated; but we readily see that they can be written as a vector product, giving

$$M_x = \frac{dp_x}{dt} = A\dot\omega_x - F\dot\omega_y - E\dot\omega_z + (\boldsymbol{\omega} \times \mathbf{p})_x. \qquad (8)$$

The equation for time rate of change of angular momentum, in the form above, has a simple interpretation. Suppose we have any vector $\mathbf{G}$, and that we consider it with respect to rotating coordinates, rotating with the angular velocity $\boldsymbol{\omega}$. If we were rotating with the coordinates, the vector would seem to have a certain time rate of change, which we may call $\partial\mathbf{G}/\partial t$. But this will not be its actual time rate of change, when looked at from a stationary system of coordinates.
For even a vector which remained constant in the rotating system would actually be changing, just on account of its rotation. In fact, the rate of change of the vector for this latter reason, using the same sort of argument which we met in Eq. (1) in describing the precessing top, is $(\boldsymbol{\omega} \times \mathbf{G})$, and the total rate of change of $\mathbf{G}$ is the sum of these two effects, or, writing $d'\mathbf{G}/dt$ for the rate of change seen from the rotating system,

$$\frac{d\mathbf{G}}{dt} = \frac{d'\mathbf{G}}{dt} + (\boldsymbol{\omega} \times \mathbf{G}). \tag{9}$$

In particular, then, with the angular momentum, we evidently have two terms of the sort considered above. We conclude therefore that $A\dot\omega_x - F\dot\omega_y - E\dot\omega_z = (d'\mathbf{p}/dt)_x$, so that these terms represent the rate of change of angular momentum, with respect to the rotating axes.

One result of the theorem we have just worked out is interesting. Let the vector $\mathbf{G}$ be the angular velocity. Then $d\boldsymbol{\omega}/dt = d'\boldsymbol{\omega}/dt$, since the vector product $(\boldsymbol{\omega} \times \boldsymbol{\omega})$ is zero. Hence the components of time rate of change of angular velocity are the same in fixed as in rotating axes.

65. Euler's Equations. The equations of motion, (7) or (8), take on a particularly simple form when expressed in terms of the principal axes. Let us first take our fixed axes $xyz$ so that they coincide with the instantaneous values of the rotating, principal axes. Then $D$, $E$, $F$ are instantaneously zero, and the equations (7) are

$$M_x = A_0\dot\omega_x - (B_0 - C_0)\omega_y\omega_z,$$

with two similar equations. But now let $\omega_1$, $\omega_2$, $\omega_3$ be the components of angular velocity with respect to the rotating principal axes. Momentarily these equal $\omega_x$, $\omega_y$, $\omega_z$. But also $\dot\omega_1$ is the same thing as $(d'\boldsymbol{\omega}/dt)_x$, the $x$ component of the time rate of change of angular velocity with respect to the rotating axes. We have just shown, however, that this equals $(d\boldsymbol{\omega}/dt)_x$, or $\dot\omega_x$.
Hence we can rewrite our equations entirely in terms of the moving axes,

$$\begin{aligned} M_1 &= A_0\dot\omega_1 - (B_0 - C_0)\omega_2\omega_3,\\ M_2 &= B_0\dot\omega_2 - (C_0 - A_0)\omega_3\omega_1,\\ M_3 &= C_0\dot\omega_3 - (A_0 - B_0)\omega_1\omega_2, \end{aligned} \tag{10}$$

where $M_1$, $M_2$, $M_3$ are the components of torque with respect to the rotating axes. These equations are called Euler's equations.

66. Torque-free Motion of a Symmetric Rigid Body. We shall now apply Euler's equations to the motion of a rigid body symmetric about an axis, subject to the action of no external torques (either the external forces are zero, or act at the center of mass). The earth provides a good example, if we neglect the torques due to sun and moon. We choose the center of mass as an origin, and take the axis of symmetry as principal axis 3. The principal moments of inertia are then $A_0$, $A_0$, $C_0$. Euler's equations for this case are

$$A_0\dot\omega_1 + \omega_2\omega_3(C_0 - A_0) = 0, \qquad A_0\dot\omega_2 + \omega_1\omega_3(A_0 - C_0) = 0, \qquad C_0\dot\omega_3 = 0.$$

The last equation integrates at once, giving $\omega_3 = \text{constant}$. This means that the resultant angular velocity has a constant component along the axis of symmetry. If we now place $\alpha = \omega_3(C_0 - A_0)/A_0$, the two other equations are $\dot\omega_1 + \alpha\omega_2 = 0$ and $\dot\omega_2 - \alpha\omega_1 = 0$. Differentiating the first of these, we find $\ddot\omega_1 + \alpha\dot\omega_2 = \ddot\omega_1 + \alpha^2\omega_1 = 0$, which has as its solution $\omega_1 = a\cos(\alpha t + \epsilon)$, and putting this value of $\omega_1$ in the second equation we find $\omega_2 = a\sin(\alpha t + \epsilon)$, where $a$ and $\epsilon$ are integration constants.

From these equations we see that the resultant angular velocity $\omega = \sqrt{\omega_1^2 + \omega_2^2 + \omega_3^2} = \sqrt{a^2 + \omega_3^2}$ is constant, and that the projection of $\boldsymbol{\omega}$ on the plane perpendicular to the axis of symmetry and fixed in the body describes a circle of radius $a$ with a period given by

$$\tau = \frac{2\pi}{\alpha} = \frac{2\pi}{\omega_3}\,\frac{A_0}{C_0 - A_0}.$$

In the case of the earth, $\omega_3 = 2\pi$ per day, so that $\tau$ becomes $A_0/(C_0 - A_0)$ days, which is about 300 days and is known as the Euler period.
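The circular motion of the projection of the angular velocity can be confirmed by integrating Euler's equations directly. The following sketch is an illustration only: the fourth-order Runge-Kutta integrator, the numerical values, and all function names are choices made here, not part of the text. It integrates the torque-free equations for a symmetric body and compares the result with $\omega_1 = a\cos(\alpha t + \epsilon)$, $\omega_2 = a\sin(\alpha t + \epsilon)$.

```python
import math

def euler_rhs(w, A0, C0):
    """Time derivatives of (w1, w2, w3) for a torque-free symmetric body."""
    w1, w2, w3 = w
    return ((A0 - C0) * w2 * w3 / A0,
            (C0 - A0) * w1 * w3 / A0,
            0.0)

def rk4_step(w, dt, A0, C0):
    """One classical Runge-Kutta step (an integrator chosen for this sketch)."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))
    k1 = euler_rhs(w, A0, C0)
    k2 = euler_rhs(shifted(w, k1, dt / 2), A0, C0)
    k3 = euler_rhs(shifted(w, k2, dt / 2), A0, C0)
    k4 = euler_rhs(shifted(w, k3, dt), A0, C0)
    return tuple(wi + dt / 6 * (a + 2 * b + 2 * c + d)
                 for wi, a, b, c, d in zip(w, k1, k2, k3, k4))

def integrate(w0, t_end, steps, A0, C0):
    """Advance the angular velocity components from t = 0 to t = t_end."""
    w, dt = tuple(w0), t_end / steps
    for _ in range(steps):
        w = rk4_step(w, dt, A0, C0)
    return w

def analytic(w0, t, A0, C0):
    """Closed-form solution w1 = a cos(alpha t + eps), w2 = a sin(alpha t + eps)."""
    w1, w2, w3 = w0
    alpha = w3 * (C0 - A0) / A0
    a = math.hypot(w1, w2)
    eps = math.atan2(w2, w1)
    return (a * math.cos(alpha * t + eps), a * math.sin(alpha * t + eps), w3)

def euler_period(A0, C0, w3):
    """Period of the circular motion of the projection of omega in the body."""
    return (2 * math.pi / w3) * A0 / (C0 - A0)

# Illustrative values only.
w0 = (0.3, 0.1, 2.0)
print(integrate(w0, 5.0, 5000, 1.0, 1.5))
print(analytic(w0, 5.0, 1.0, 1.5))
# With w3 = 2*pi per day and A0/(C0 - A0) = 300 the period comes out in days.
print(euler_period(300.0, 301.0, 2 * math.pi))
```

The ratio $A_0/(C_0 - A_0) = 300$ used in the last line is simply the round figure quoted in the text for the earth.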
This period is not observed, but there is one of 427 days known as the Chandler period giving rise to a variation of latitude. When the imperfect rigidity of the earth is taken into account, it is possible to identify these two as the same.

We can get an idea of the actual motion most clearly from a diagram. In Fig. 15, we show an oblate spheroid, to represent the symmetrical body. There is a circular conical hole $Obd$ cut out surrounding the north pole $a$, and a fixed cone touching the inside of this hole, and centered on the line $Oc$. The motion is now as if one cone rolled on the other. We see at once that, since the axis $Ob$ is instantaneously at rest, it is the instantaneous axis of rotation $\boldsymbol{\omega}$. As time goes on, this axis of rotation traces out the cone $Obd$ with respect to the body, and at the same time traces out the cone $Obe$ fixed in space. The axis of the fixed cone, $Oc$, is the direction of the constant total angular momentum vector. Other properties of the motion are discussed in a problem.

Fig. 15. Space and body cones for the torque-free rotation of a symmetrical body. The cone $Obd$, fixed in the body, rolls on the cone $Obe$, fixed in space. The line $Oa$ is the axis of symmetry of the body, $Ob$ is the instantaneous axis of rotation, $Oc$ the fixed axis of total angular momentum.

67. Euler's Angles. If we wish information about the general motion of the top, we must introduce some set of coordinates capable of describing its position. So far, we have not had any set of coordinates at all. We have worked with angular velocities, and angular momenta, which were vectors, and all the equations came out very neatly and symmetrically in terms of them. But there is a peculiar thing about the three components of angular velocity: there are no corresponding angles to serve as coordinates. This is not true in plane motions. If a body rotates
with angular velocity $\omega$ about a fixed axis, we can regard $\omega$ as $\dot\theta$, where $\theta$ is the angle through which the body has turned about the fixed axis, and which can be used as a coordinate. Then we can say that the component of angular momentum $I\omega$ is the momentum conjugate to $\theta$, and the whole Lagrangian and Hamiltonian methods go through perfectly. As soon as we have three dimensions, however, and the possibility of different axes of rotation, we no longer have such angles. It is readily seen, for instance (we leave it for a problem), that one cannot use the angles through which the body has turned about the three coordinate axes as variables. The fact is that, though angular momentum is a vector, finite angular rotations are not, and do not have three components which can be used as coordinates.

We are forced, then, by the peculiar nature of angular rotations, to look for some set of three angles to describe the position of the body, which unfortunately cannot have the symmetrical nature of the $x$, $y$, $z$ components of angular velocity. The usual set of angles are called Euler's angles, and are shown in Fig. 16. We ordinarily use these angles for discussing a symmetrical body. Then $Oz$ is a fixed axis, for example, the vertical in the top problem. $OC_0$ is the axis of figure of the body, taken as the third principal axis. $\theta$ measures the angle between axis of figure and fixed axis, $\phi$ measures the angle of precession of the axis of figure, so that $d\phi/dt$ is the angular velocity of precession, and $\psi$ measures the rotation of the body about its axis of figure measured from the line $ON$, called the nodal line.

Fig. 16. Euler's angles. For a symmetrical body, $OC_0$ is the axis of symmetry, $OA_0$ and $OB_0$ two axes fixed in the body at right angles. $\theta$ and $\phi$ measure colatitude and longitude of the direction of the principal axis; $\psi$ measures the rotation of the body about the principal axis.
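The remark that finite rotations do not behave as vectors can be made concrete with rotation matrices. The sketch below is an illustration under assumptions made here (the right-handed matrix convention and the function names are not from the text). It applies equal rotations about the $x$ and $y$ axes in the two possible orders and reports the largest difference between the resulting orientation matrices. For 90 deg. the two orders disagree completely, while for small angles the disagreement shrinks as the square of the angle, which is why angular velocities, built from infinitesimal rotations, can still be added as vectors.

```python
import math

def rot_x(angle):
    """Matrix of a rotation through `angle` about the x axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(angle):
    """Matrix of a rotation through `angle` about the y axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(P, Q):
    """Product of two 3 x 3 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def order_mismatch(angle):
    """Largest element of (x first, then y) minus (y first, then x)."""
    x_then_y = matmul(rot_y(angle), rot_x(angle))
    y_then_x = matmul(rot_x(angle), rot_y(angle))
    return max(abs(x_then_y[i][j] - y_then_x[i][j])
               for i in range(3) for j in range(3))

for angle in (math.pi / 2, 1e-1, 1e-2, 1e-3):
    print(angle, order_mismatch(angle))
```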
Thus we see that, though the Eulerian angles do not have symmetry, they are very natural ones for the problem in hand.

Let us set up the components of angular velocity, and the kinetic energy, in terms of the Eulerian angles. The motion of the body may be thought of as consisting of a rotation of the body about $OC_0$ and the motion of $OC_0$ relative to the fixed frame of reference. The former is described by the angular velocity $\dot\psi$ which has the components $0$, $0$, $\dot\psi$ (referred to the principal axes). The latter motion consists of (a) a rotation $\dot\theta$ about the nodal line $ON$ as an axis, which was zero in the steady motion of the top considered above; and (b) of a precession $\dot\phi$ about the $z$ axis. The components of these angular velocities along the principal axes $OA_0$, $OB_0$, and $OC_0$ are

(a) $\dot\theta\cos\psi$; $-\dot\theta\sin\psi$; $0$
(b) $\dot\phi\sin\theta\sin\psi$; $\dot\phi\sin\theta\cos\psi$; $\dot\phi\cos\theta$.

Adding these angular velocity components, we have

$$\begin{aligned} \omega_1 &= \dot\theta\cos\psi + \dot\phi\sin\theta\sin\psi,\\ \omega_2 &= -\dot\theta\sin\psi + \dot\phi\sin\theta\cos\psi,\\ \omega_3 &= \dot\psi + \dot\phi\cos\theta. \end{aligned}$$

Here $\dot\psi$ corresponds to the quantity $\omega$ used for discussing the steady motion, and $\dot\phi$ to $\omega_1$. From these components of angular velocity, of course, we can at once get the angular momenta. The kinetic energy, as we have seen, is $\tfrac{1}{2}(A_0\omega_1^2 + B_0\omega_2^2 + C_0\omega_3^2)$. But in our case of a symmetrical top this simplifies, since $A_0 = B_0$, and substituting we have

$$T = \tfrac{1}{2}[A_0(\omega_1^2 + \omega_2^2) + C_0\omega_3^2] = \tfrac{1}{2}[A_0(\dot\theta^2 + \sin^2\theta\,\dot\phi^2) + C_0(\dot\psi + \dot\phi\cos\theta)^2]. \tag{11}$$

Using the kinetic energy, or corresponding Lagrangian function, in terms of the Eulerian angles, we can easily derive the Lagrangian equations of motion, and find them to be the same Euler equations which we have already obtained. For instance, using $L = T$,

$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot\psi}\right) - \frac{\partial L}{\partial\psi} = \frac{d}{dt}\left[C_0(\dot\psi + \dot\phi\cos\theta)\right] = C_0\dot\omega_3 = M_3,$$

which is the third of Euler's equations, when we remember that $A_0 - B_0 = 0$.

68.
General Motion of a Symmetrical Top under Gravity. We are now ready to proceed with the general discussion of the top under gravity, for which we have already considered the steady precession. We note first that the torque is at right angles to the axis of figure. Hence by the third of Euler's equations, $\dot\omega_3 = 0$, or $\omega_3$ is constant. Instead of using the other two of Euler's equations, it is somewhat more convenient to use the conservation of energy and of angular momentum to discuss the motion, much as we did in our earlier chapter on central motion. For the kinetic energy we have $T = \tfrac{1}{2}[A_0(\dot\theta^2 + \sin^2\theta\,\dot\phi^2) + C_0\omega_3^2]$, and the potential energy is $Mgl\cos\theta$, where $l$ is the distance from $O$ to the center of mass. Thus the energy equation becomes

$$E = \tfrac{1}{2}[A_0(\dot\theta^2 + \sin^2\theta\,\dot\phi^2) + C_0\omega_3^2] + Mgl\cos\theta, \tag{12}$$

where $E$ is the total energy. We now can eliminate $\dot\phi$ from the equation above by utilizing the fact that there are no torques taken about the $z$ axis. This means that the component of angular momentum along this axis is constant. The angular momentum due to the rotation $\omega_3$ of the top about its axis of symmetry has a vertical component $C_0\omega_3\cos\theta$. The angular velocity $\dot\theta$ of the axis contributes nothing to the vertical angular momentum. The other component of the angular velocity of the axis is $\dot\phi\sin\theta$, and this is about an axis perpendicular to $OC_0$, making an angle of $\pi/2 - \theta$ with the vertical. Thus the contribution of this term is $A_0\sin^2\theta\,\dot\phi$, so that the conservation of angular momentum about the $z$ axis yields

$$p_z = C_0\omega_3\cos\theta + A_0\sin^2\theta\,\dot\phi. \tag{13}$$

We now substitute the value of $\dot\phi$ taken from this equation into the energy equation and get a differential equation for $\theta$ alone, so that we may discuss the time variations of $\theta$, or the variations of the inclinations of the axis of figure of the top with the vertical.
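The substitution just described can be sketched numerically. The code below is an illustration only: the parameter values, the release condition (axis let go at rest at an inclination of 1 radian, so that $\dot\theta = \dot\phi = 0$ initially), the bisection search, and all names are assumptions made for the example. It solves Eq. (13) for $\dot\phi$, inserts it into Eq. (12) to obtain $\dot\theta^2$ as a function of $\theta$ alone, and locates the second angle at which $\dot\theta$ vanishes, which is the other limit of the motion in $\theta$.

```python
import math

def precession_rate(theta, p_z, A0, C0, w3):
    """phi-dot obtained by solving Eq. (13) at inclination theta."""
    return (p_z - C0 * w3 * math.cos(theta)) / (A0 * math.sin(theta) ** 2)

def theta_dot_squared(theta, E, p_z, A0, C0, w3, Mgl):
    """theta-dot squared from Eq. (12) once phi-dot has been eliminated."""
    phi_dot = precession_rate(theta, p_z, A0, C0, w3)
    other_energy = (0.5 * A0 * math.sin(theta) ** 2 * phi_dot ** 2
                    + 0.5 * C0 * w3 ** 2 + Mgl * math.cos(theta))
    return 2.0 * (E - other_energy) / A0

def bisect_root(f, lo, hi, tol=1e-12):
    """Root of f between lo and hi, assuming f changes sign exactly once there."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical top: all numbers are illustrative, in arbitrary consistent units.
A0, C0, w3, Mgl = 1.0, 0.5, 20.0, 1.0
theta_start = 1.0                               # released here with theta-dot = phi-dot = 0
p_z = C0 * w3 * math.cos(theta_start)           # Eq. (13) with phi-dot = 0
E = 0.5 * C0 * w3 ** 2 + Mgl * math.cos(theta_start)   # Eq. (12) at release

def speed_sq(theta):
    return theta_dot_squared(theta, E, p_z, A0, C0, w3, Mgl)

theta_turn = bisect_root(speed_sq, theta_start + 1e-6, 2.0)
print(theta_start, theta_turn)
```

For a rapidly spinning top such as this one the two limits lie close together, so the axis dips only slightly before rising again.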
When we make this substitution, and solve for $\dot\theta$, we have

$$\dot\theta = \sqrt{2(E - V')/A_0}, \tag{14}$$

where $V'$, which plays the part of a fictitious potential energy for the motion of this coordinate, has the value

$$V' = Mgl\cos\theta + \tfrac{1}{2}C_0\omega_3^2 + \frac{(p_z - C_0\omega_3\cos\theta)^2}{2A_0\sin^2\theta}. \tag{15}$$

The first term is the gravitational energy, decreasing as the angle increases, showing that gravity tends to make the top fall. The second is a constant, the energy of the spinning motion. The third term is a dynamic term, reminding us of the centrifugal force term in the effective energy for the radial motion in a central field. It becomes infinite when $\theta = 0$ or $\pi$, since at those angles the rate of precession $\dot\phi$ would have to be infinitely rapid in order to conserve the angular momentum component $p_z$, contributing therefore an infinite amount to the energy. Between these angles this dynamic term has a single minimum. In other words, it exerts a stabilizing influence, quite apart from any external forces which may act, and leads to a stable oscillation of $\theta$ about a certain minimum of $V'$, whose position is determined by the external torque.

69. Precession and Nutation. The minimum of $V'$ can be determined by differentiating with respect to $\theta$, setting the result equal to zero. This gives

$$0 = -Mgl\sin\theta + \frac{p_z - C_0\omega_3\cos\theta}{A_0\sin^2\theta}\left[C_0\omega_3\sin\theta - A_0\cos\theta\sin\theta\,\frac{p_z - C_0\omega_3\cos\theta}{A_0\sin^2\theta}\right],$$

or, from Eq. (13), using $\omega_3 = \dot\psi + \dot\phi\cos\theta$,

$$\dot\phi[C_0\dot\psi - (A_0 - C_0)\dot\phi\cos\theta] = Mgl. \tag{16}$$

If the energy is equal to the effective potential $V'$ at this angle, $\dot\theta$ will be zero, and the motion is a pure precession of the sort described in Sec. 61. If we assume that the rate of precession is small compared with the rate of rotation, which is the only case in which the angular momentum, the angular velocity, and the axis of figure are nearly enough in the same straight line so that the arguments of that section are valid, we have $\dot\phi \ll \dot\psi$. In that case the equation becomes $\dot\phi(C_0\dot\psi) = Mgl$, $\dot\phi = Mgl/C_0\dot\psi$, in agreement with the result of Sec.
61, when we recall that in this limit $C_0\dot\psi$ is approximately the total angular momentum. This condition, or rather the accurate condition (16), determines the rate of steady precession $\dot\phi$ for any total angular momentum, a rate independent of $\theta$ to this approximation, but depending on $\theta$ if we must consider terms in $\dot\phi^2$.

If the energy $E$ is greater than the minimum of $V'$, the curve of $E$ will cut that for $V'$ at two values of $\theta$, one greater and one less than the inclination of the axis for the purely precessional motion which we have just discussed. In this case, $\theta$ will oscillate between these two limits. This oscillation is called nutation. The complete motion then consists of a combination of this nutation with a precession, as indicated in Fig. 17, where we draw the intersection of the axis of the top with a sphere. The angles $\theta_1$ and $\theta_2$ are the two angles for which $E = V'$, so that the minimum of $V'$, or the angle for the pure precessional motion corresponding to the same angular momentum, lies between these two values. In the problems, the frequency of the nutational motion is discussed. We also discuss, in Prob. 9, the special case of the "sleeping" top, in which the top starts spinning vertically. In this special case, the dynamic term in $V'$ is finite at $\theta = 0$, so that under certain circumstances oscillations about the vertical can occur.

Fig. 17. Nutation of a top. The sinusoidal curve is the projection of the axis of the top on a sphere. $\theta_1$ and $\theta_2$ are angular limits of the nutational motion.

Problems

1. Prove directly that the moment of inertia $I$, equal to $\sum m\rho^2$, is equal to $\lambda^2 A + \mu^2 B + \nu^2 C - 2\mu\nu D - 2\nu\lambda E - 2\lambda\mu F$, where $\lambda$, $\mu$, $\nu$ are the direction cosines of the axis of rotation.

2. Show that, if $T$ is the kinetic energy of a rotating body, $\mathbf{p}$ its angular momentum, $\boldsymbol{\omega}$ its angular velocity, $p_x = \partial T/\partial\omega_x$, and $2T = p_x\omega_x + p_y\omega_y + p_z\omega_z$.

3. In Fig.
15, show that $\tan \angle aOb = a/\omega_3$, and $\tan \angle aOc = \dfrac{A_0}{C_0}\,\dfrac{a}{\omega_3}$, where $\omega_3$, $a$ represent the components of the angular velocity along and at right angles to the figure axis. Knowing that the time required for the axis $Ob$ of angular velocity to perform a complete rotation with respect to the body is $\tau = \dfrac{2\pi}{\omega_3}\,\dfrac{A_0}{C_0 - A_0}$, show that the time for it to perform a complete rotation in space is approximately $\dfrac{2\pi}{\omega_3}\,\dfrac{A_0}{C_0}$ if angles $aOb$ and $aOc$ are small. Hence show that for the earth the axis of angular velocity is not fixed, but rotates about a fixed direction approximately once a day.

4. The earth is acted on by torques exerted by the sun and moon, and as a consequence its angular momentum precesses about a fixed direction in space. This is entirely separate from the effect of Fig. 15 and Prob. 3, which we now neglect. This precession has a period of 25,800 years, and carries the angular momentum about a cone of semi-vertical angle 23° 27', so that the pole in succession points to different parts of the heavens, resulting in the precession of the equinoxes, and in the fact that different stars act as pole star at different periods of history. Show that the motion can be represented by the rolling of a cone fixed in the earth, of diameter 21 in. at the north pole, on a cone of angle 23° 27' fixed in the heavens.

5. A system of electrons moving about a center of attraction has a certain angular momentum, equal to $\sum m(\mathbf{r} \times \mathbf{v})$, and also a magnetic moment, equal to $\dfrac{e}{2c}\sum(\mathbf{r} \times \mathbf{v})$, where $e$ is the charge and $m$ the mass of an electron, $c$ the velocity of light. This magnetic effect results because the electrons in rotation act like little currents, which in turn have magnetic fields like bar magnets. An external magnetic field $\mathbf{H}$ exerts a torque on the system, equal to the vector product of the magnetic moment and $\mathbf{H}$.
Show that under the action of the field, the system of electrons precesses with angular velocity $eH/2mc$ about the direction of the field. This precession, which, as we see, is independent of the velocities of the electrons, is called Larmor's precession.

6. One reason why finite rotations do not act as vectors is that they do not commute, that is, the same two rotations applied in one order lead to one answer, but in the opposite order to a very different answer. Demonstrate this by diagrams, imagining that we have a cube (label its faces by different letters or numbers on a diagram), originally in one position, with its edges parallel to the coordinate axes (position a). First rotate through 90 deg. about the $x$ axis (position b), then through 90 deg. about the $y$ axis (position c), drawing diagrams of each step. Then, starting again from position a, rotate first through 90 deg. about the $y$ axis (position d) and then through 90 deg. about the $x$ axis (position e). Show that (c) and (e) are entirely different orientations.

7. Write down the kinetic energy of a nonsymmetrical body, in terms of Euler's angles. Derive the Lagrangian equation for $\psi$, and show that it reduces to one of Euler's equations.

8. In the same way as in Prob. 7, set up the other two Lagrangian equations, showing that they lead to the other two of Euler's equations.

9. A top is started spinning vertically, with no other motion, so that initially $\theta = 0$, $d\theta/dt = 0$. Show that $p_z = C_0\omega_3$, $E = \tfrac{1}{2}C_0\omega_3^2 + Mgl$. Substituting these in the expression of Eq. (14) for $\dot\theta$, show that if $\omega_3 > \omega'$, where $(\omega')^2 = 4Mgl\,A_0/C_0^2$, the angle must remain equal to zero, but that if $\omega_3$ falls below $\omega'$, $\theta$ will oscillate between $0$ and the angle $\cos^{-1}[2(\omega_3/\omega')^2 - 1]$.
Experimentally, if a top is started as we have described, with $\omega_3 > \omega'$, there will be a frictional torque decreasing $\omega_3$, and as soon as the torque reduces $\omega_3$ below $\omega'$, the top will begin to wobble.

10. For a nutation of small amplitude about the steady precessional motion of a top, the angle $\theta$ oscillates sinusoidally about the equilibrium angle. Find the frequency of the nutation, by expanding the potential $V'$ in power series in $\theta - \theta_0$, where $\theta_0$ is the angle of steady precession with the same angular momentum. Retain only the constant and the term in $(\theta - \theta_0)^2$, and get the frequency by comparing with the corresponding expression for the linear oscillator.

CHAPTER XI

COUPLED SYSTEMS AND NORMAL COORDINATES

The mechanical problems which we have treated so far have been those where just one particle moved around, sometimes in a potential field, sometimes subject to forces not derivable from a potential. In many problems, however, there are several particles exerting forces on each other and influencing each other's motion. As examples, we have the actual solar system, where the sun, planets, and moons all act on each other; an atom, with the various electrons reacting; a molecule, with the atoms vibrating under the action of their mutual forces. A more familiar case is that of several electric circuits coupled together by induction or some other method. Another is that in which several pendulums or springs can react on each other, as through their supports, and affect each other's motion. There is evidently a very wide variety of problems; we shall treat only the simplest, in which two linear oscillators, or electric circuits, are coupled together by a force depending linearly on both the displacements.

70. Coupled Oscillators. Suppose we have two undamped one-dimensional oscillators, whose displacements are $y_1$ and $y_2$ respectively, and whose equations of motion, if uncoupled, would be

$$m_1\frac{d^2y_1}{dt^2} + k_1y_1 = 0, \qquad m_2\frac{d^2y_2}{dt^2} + k_2y_2 = 0.$$
Now let them be acted on by equal and opposite forces proportional to the distance apart, $-a(y_1 - y_2)$ and $-a(y_2 - y_1)$ respectively, as if there were a spring stretched between them. The equations then become

$$m_1\frac{d^2y_1}{dt^2} + (k_1 + a)y_1 - ay_2 = 0, \qquad m_2\frac{d^2y_2}{dt^2} + (k_2 + a)y_2 - ay_1 = 0.$$

As a matter of convenience in the calculation, we shall introduce changes of notation: let $y_1\sqrt{m_1} = x_1$, $y_2\sqrt{m_2} = x_2$, $(k_1 + a)/m_1 = \omega_1^2$, $(k_2 + a)/m_2 = \omega_2^2$, $a/\sqrt{m_1m_2} = c$. Then the equations are

$$\frac{d^2x_1}{dt^2} + \omega_1^2x_1 - cx_2 = 0, \qquad \frac{d^2x_2}{dt^2} + \omega_2^2x_2 - cx_1 = 0. \tag{1}$$

These are two simultaneous differential equations, and there are several ways of solving them. First we may take advantage of their property of being linear with constant coefficients, and see if we cannot get exponential solutions. We assume $x_1 = Ae^{i\omega t}$, $x_2 = Be^{i\omega t}$, where $A$, $B$, $\omega$ are to be determined. Substituting, we have

$$(-\omega^2 + \omega_1^2)A - cB = 0, \qquad -cA + (-\omega^2 + \omega_2^2)B = 0. \tag{2}$$

If we regarded $\omega$ as being known, these would be two simultaneous equations for the two constants $A$ and $B$. Evidently they are linear homogeneous equations. Now it is a theorem of algebra that in general two such equations do not have any solutions other than $A = B = 0$, unless the determinant of coefficients,

$$\begin{vmatrix} -\omega^2 + \omega_1^2 & -c \\ -c & -\omega^2 + \omega_2^2 \end{vmatrix} \tag{3}$$

is equal to zero. Let us see what this means. We could solve the first equation for $A$ in terms of $B$: $A = Bc/(-\omega^2 + \omega_1^2)$. But we could do the same with the second, $A = B(-\omega^2 + \omega_2^2)/c$. If these solutions are to be consistent, it must be that the two factors on the right are equal, $c/(-\omega^2 + \omega_1^2) = (-\omega^2 + \omega_2^2)/c$, or $(-\omega^2 + \omega_1^2)(-\omega^2 + \omega_2^2) - c^2 = 0$. But this is just the equation obtained by setting the determinant equal to zero, so that we have verified the result of algebra. Now the equation which we have obtained, called the secular equation, can be satisfied, for we still have $\omega$ at our disposal.
Solving the quadratic, this gives

$$\omega^2 = \frac{\omega_1^2 + \omega_2^2}{2} \pm \sqrt{\left(\frac{\omega_1^2 - \omega_2^2}{2}\right)^2 + c^2}. \tag{4}$$

This gives two values for $\omega^2$, or two different possible frequencies of motion for the system. This is natural, since we should have two frequencies if they were uncoupled, one for the one particle, the other for the other. Suppose the first, with the $+$ sign, is called $\omega'$, and the second, with the $-$ sign, $\omega''$. It is interesting to find $\omega'$ and $\omega''$, in the case where $c$, measuring the interaction between the particles, is small. Then we can expand by the binomial theorem, obtaining

$$\omega'^2 = \omega_1^2 + \frac{c^2}{\omega_1^2 - \omega_2^2} + \cdots, \qquad \omega''^2 = \omega_2^2 + \frac{c^2}{\omega_2^2 - \omega_1^2} + \cdots, \tag{5}$$

showing that the frequencies approach the natural frequencies of the separate systems when the coupling goes to zero, but that they differ from them by quantities which increase as $c$ increases. It is interesting to see that the frequencies are always spread apart by the interaction: if $\omega_1^2 > \omega_2^2$, then $\omega'^2 > \omega_1^2$, $\omega''^2 < \omega_2^2$, and correspondingly if the situation is reversed. There are several relations between $\omega'$ and $\omega''$ which we shall need, and which we write for reference; they are easily proved from the solutions already found, and hold independently of the size of $c$:

$$\begin{aligned} \omega'^2\omega''^2 &= \omega_1^2\omega_2^2 - c^2,\\ \omega'^2 + \omega''^2 &= \omega_1^2 + \omega_2^2,\\ (-\omega'^2 + \omega_1^2)(-\omega''^2 + \omega_1^2) &= -c^2. \end{aligned} \tag{6}$$

Having determined the two possible frequencies of vibration of the system, we next find the amplitudes $A'$ and $B'$ corresponding to $\omega'$, and $A''$ and $B''$ corresponding to $\omega''$. These are evidently given by

$$\frac{A'}{B'} = \frac{c}{-\omega'^2 + \omega_1^2}, \qquad \frac{A''}{B''} = \frac{c}{-\omega''^2 + \omega_1^2}. \tag{7}$$

That is to say, the ratios of $A$'s to $B$'s are determined, but not the values themselves. The situation is then the following: we have one possible solution, $x_1 = A'e^{i\omega't}$, $x_2 = B'e^{i\omega't}$, where the ratio of the amplitudes of $x_1$ and $x_2$ is fixed, but the magnitudes are otherwise arbitrary.
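The results (4) to (7) lend themselves to a quick numerical check. The sketch below uses illustrative values of $\omega_1^2$, $\omega_2^2$, and $c$ chosen here, and hypothetical function names. It evaluates the two roots of the secular equation, tests the three relations (6), compares the larger root with the weak-coupling expansion (5), and confirms that the amplitude ratio from the first of equations (2) agrees with the ratio obtained from the second.

```python
import math

def normal_frequencies_squared(w1_sq, w2_sq, c):
    """The two roots of the secular equation, Eq. (4): (omega'^2, omega''^2)."""
    mean = 0.5 * (w1_sq + w2_sq)
    root = math.sqrt((0.5 * (w1_sq - w2_sq)) ** 2 + c * c)
    return mean + root, mean - root

def amplitude_ratio(w_sq, w1_sq, c):
    """A/B for the mode of squared frequency w_sq, from Eq. (7)."""
    return c / (w1_sq - w_sq)

# Illustrative values only: omega_1^2 = 4, omega_2^2 = 1, weak coupling c = 0.3.
w1_sq, w2_sq, c = 4.0, 1.0, 0.3
hi_sq, lo_sq = normal_frequencies_squared(w1_sq, w2_sq, c)

print("omega'^2, omega''^2:", hi_sq, lo_sq)
print("relations (6):",
      hi_sq * lo_sq - (w1_sq * w2_sq - c * c),
      hi_sq + lo_sq - (w1_sq + w2_sq),
      (w1_sq - hi_sq) * (w1_sq - lo_sq) + c * c)
print("expansion (5):", w1_sq + c * c / (w1_sq - w2_sq))
# The second of equations (2) gives A/B = (omega_2^2 - omega^2)/c instead.
print("A'/B' two ways:", amplitude_ratio(hi_sq, w1_sq, c), (w2_sq - hi_sq) / c)
```

The three quantities printed for relations (6) should all vanish to rounding error, and the two values of $A'/B'$ should coincide, which is the consistency condition that led to the secular equation.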
Of course, there is a similar solution with $-i\omega't$ in the exponent, so that combining these in the usual way we have an arbitrary phase and amplitude, or two arbitrary constants. Next we have also the solutions $x_1 = A''e^{i\omega''t}$, $x_2 = B''e^{i\omega''t}$, of the same sort. And now, on account of the linear nature of the equations, we can make linear combinations of these, obtaining

$$x_1 = A'e^{i\omega't} + A''e^{i\omega''t}, \qquad x_2 = B'e^{i\omega't} + B''e^{i\omega''t}. \tag{8}$$

That is, each coordinate has two periods in its motion, or is doubly periodic. Since the amplitudes are to a certain extent arbitrary, it is possible for only one frequency to be excited at a time, or for both to go simultaneously.

It is interesting to consider the physical nature of the motions described by these equations. Let us assume that the two systems are only loosely coupled together ($c$ is small). Then one possible mode of vibration has frequency $\omega'$, only slightly greater than the frequency $\omega_1$ which the first oscillator would have had without coupling. It is not a vibration of the first oscillator alone; both are vibrating at the same time. However, if we examine the coefficients $A'$, $B'$ in this case, we find that $B'$ is small compared with $A'$, meaning that the amplitude of the second oscillator is small compared with that of the first. Thus, using $B'/A' = (\omega_1^2 - \omega'^2)/c$, and $\omega'^2 = \omega_1^2 + c^2/(\omega_1^2 - \omega_2^2) + \cdots$, we have approximately $B'/A' = c/(\omega_2^2 - \omega_1^2)$. This is as if the first oscillator, vibrating with frequency $\omega'$, which is approximately $\omega_1$, and amplitude $A'$, were forcing the second oscillator by virtue of the coupling, with a force $cx_1$, or $cA'e^{i\omega't}$, or approximately $cA'e^{i\omega_1t}$. This would produce a forced amplitude of $(cA'e^{i\omega't})/(\omega_2^2 - \omega'^2)$, which is just what we have found.
Similarly the second oscillator can vibrate almost by itself, with the frequency $\omega''$ which almost equals $\omega_2$, but it reacts back on the first and produces a small forced amplitude. It is now in the further approximations to the interaction that the differences between $\omega_1$ and $\omega'$, $\omega_2$ and $\omega''$, come in.

We have considered the types of vibrations separately. But there is no reason why both cannot be simultaneously excited, so that each particle will be vibrating with both periods at once. Then the phenomenon of beats can easily come in; for the sum of two sinusoidal vibrations of different frequencies is equivalent to a single vibration of varying amplitude, as we see from the equation

$$\cos\omega't + \cos\omega''t = \left(2\cos\frac{\omega' - \omega''}{2}t\right)\cos\frac{\omega' + \omega''}{2}t,$$

where the first expression, in parentheses, represents an amplitude oscillating with the slow frequency $(\omega' - \omega'')/2$, and modulating the latter term, a rapid vibration of frequency $(\omega' + \omega'')/2$. If $\omega'$ and $\omega''$ are approximately equal, the effect gets most marked, the frequency of the beats approaching zero. There is in this case a pulsation of amplitude and energy from one of the oscillators to the other. This is often seen in other similar problems. Thus, if a weight is hung from a spiral spring and is set vibrating up and down, it will be observed that after a certain lapse of time the vertical motion will decrease, but there will be a torsional motion of considerable amplitude. As time goes on, these two forms of motion will alternately take up large amplitudes. The reason is that there is a coupling between the two forms of oscillation, and the beat phenomenon we have just described comes into play.

71. Normal Coordinates. We have just seen that the general oscillation of two coupled particles is a sum of two vibrations of different frequencies. If only one of these vibrations is excited, both particles oscillate with the same frequency but different amplitudes.
It now proves to be possible to introduce new coordinates $X$ and $Y$, called normal coordinates, given by linear combinations of the displacements $x_1$ and $x_2$ of the two particles, which have the following properties: the generalized force acting on $X$ is proportional to $X$ alone, independent of $Y$, so that the equations of motion are separated, and $X$ and $Y$ execute independent simple harmonic vibrations, of different frequencies. When one of the coordinates alone is different from zero, the other remaining equal to zero, just one of the two vibrations is excited. The existence of such coordinates is made plausible by the following fact: if one vibration alone is excited, $x_1$ is proportional say to $\alpha$ times a sinusoidal function of time, $x_2$ to $\beta$ times the same sinusoidal function. In this case $\beta x_1 - \alpha x_2$ will be always zero. This linear combination of $x_1$ and $x_2$ will be proportional, then, to the normal coordinate associated with the second type of vibration, which is not excited in the case mentioned. By assuming that the second vibration alone is excited, we can in a similar way infer the form of the first normal coordinate. We proceed in the next paragraph to the general formulation of the normal coordinates.

Suppose we set up quantities $X$, $Y$, defined by the equations

$$x_1 = \alpha'X + \alpha''Y, \qquad x_2 = \beta'X + \beta''Y, \tag{9}$$

where $C\alpha' = A'$, $C\beta' = B'$, $D\alpha'' = A''$, $D\beta'' = B''$, $C$ and $D$ being constants. Since only the ratios of the $\alpha$'s and $\beta$'s are so far determined, we may demand that the magnitudes be so fixed in this case that $\alpha'^2 + \beta'^2 = 1$, $\alpha''^2 + \beta''^2 = 1$. This is called the condition of normalization, and we shall see its significance a little later. Our quantities $X$ and $Y$ can now be treated as generalized coordinates, and we can easily see that the equations of motion, in terms of them, have the variables separated. Let us set up the equations of motion in these new variables.
We have

$$T = \frac{1}{2}\left[\left(\frac{dx_1}{dt}\right)^2 + \left(\frac{dx_2}{dt}\right)^2\right] = \frac{1}{2}\left[(\alpha'^2 + \beta'^2)\left(\frac{dX}{dt}\right)^2 + (\alpha''^2 + \beta''^2)\left(\frac{dY}{dt}\right)^2 + 2(\alpha'\alpha'' + \beta'\beta'')\frac{dX}{dt}\frac{dY}{dt}\right].$$

Using the relations (6) and (7), the last term can be shown to be zero. This is called the condition of orthogonality, for reasons which will later be evident. Using the normalization conditions mentioned above, we have finally

$$T = \frac{1}{2}\left[\left(\frac{dX}{dt}\right)^2 + \left(\frac{dY}{dt}\right)^2\right]. \tag{10}$$

Next for the potential energy we have, from the original equations,

$$\begin{aligned} V &= \tfrac{1}{2}(\omega_1^2x_1^2 + \omega_2^2x_2^2 - 2cx_1x_2)\\ &= \tfrac{1}{2}\{\omega_1^2(\alpha'X + \alpha''Y)^2 + \omega_2^2(\beta'X + \beta''Y)^2 - 2c(\alpha'X + \alpha''Y)(\beta'X + \beta''Y)\}\\ &= \tfrac{1}{2}\{(\omega_1^2\alpha'^2 + \omega_2^2\beta'^2 - 2c\alpha'\beta')X^2 + (\omega_1^2\alpha''^2 + \omega_2^2\beta''^2 - 2c\alpha''\beta'')Y^2\\ &\qquad + 2[\omega_1^2\alpha'\alpha'' + \omega_2^2\beta'\beta'' - c(\alpha'\beta'' + \alpha''\beta')]XY\}. \end{aligned}$$

Here it can be shown by a little manipulation that the first parenthesis equals $\omega'^2$, the second $\omega''^2$, and the third is zero, so that

$$V = \tfrac{1}{2}(\omega'^2X^2 + \omega''^2Y^2). \tag{11}$$

In terms of the new variables, the variables are separated, and Lagrange's equations become simply

$$\frac{d^2X}{dt^2} + \omega'^2X = 0, \qquad \frac{d^2Y}{dt^2} + \omega''^2Y = 0,$$

whose solutions are $X = \text{constant} \times e^{i\omega't}$, $Y = \text{constant} \times e^{i\omega''t}$. Thus each of the generalized coordinates executes a simple harmonic motion, which of course can have arbitrary amplitude and phase, and our final result, if we set the first constant equal to $C$, the second to $D$, is $x_1 = \alpha'X + \alpha''Y = \alpha'(Ce^{i\omega't}) + \alpha''(De^{i\omega''t}) = A'e^{i\omega't} + A''e^{i\omega''t}$, etc., agreeing with the results already found.

It may be proved in general that for any mechanical problem in which the potential is a quadratic function of the coordinates, coordinates of this kind (called normal coordinates) can be set up, having the property that they have no cross terms between different coordinates in either the kinetic or the potential energy, so that the Lagrangian function is a sum of squares of coordinates and velocities, with constant coefficients, and the variables are separated in the Lagrangian equations. The general method of setting up these normal coordinates follows exactly the model we have found for our simple problem.
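The "little manipulation" referred to above can be checked numerically for particular values. The sketch below is illustrative only (the numbers and function names are assumptions, and it requires $c \neq 0$). It builds the normalized coefficients $\alpha'$, $\beta'$, $\alpha''$, $\beta''$ from the amplitude ratios (7), then evaluates the orthogonality sum $\alpha'\alpha'' + \beta'\beta''$ and the coefficients of $X^2$, $Y^2$, and $XY$ in $2V$.

```python
import math

def normal_modes(w1_sq, w2_sq, c):
    """Return [(w_sq, alpha, beta), ...] for the two modes, normalized so that
    alpha^2 + beta^2 = 1. Assumes c is not zero."""
    mean = 0.5 * (w1_sq + w2_sq)
    root = math.sqrt((0.5 * (w1_sq - w2_sq)) ** 2 + c * c)
    modes = []
    for w_sq in (mean + root, mean - root):
        ratio = c / (w1_sq - w_sq)                 # alpha / beta, from Eq. (7)
        beta = 1.0 / math.sqrt(1.0 + ratio * ratio)
        modes.append((w_sq, ratio * beta, beta))
    return modes

def potential_coefficients(w1_sq, w2_sq, c, modes):
    """Coefficients of X^2, Y^2 and XY in 2V after substituting Eq. (9)."""
    (_, a1, b1), (_, a2, b2) = modes
    xx = w1_sq * a1 * a1 + w2_sq * b1 * b1 - 2.0 * c * a1 * b1
    yy = w1_sq * a2 * a2 + w2_sq * b2 * b2 - 2.0 * c * a2 * b2
    xy = 2.0 * (w1_sq * a1 * a2 + w2_sq * b1 * b2 - c * (a1 * b2 + a2 * b1))
    return xx, yy, xy

# Illustrative values only.
w1_sq, w2_sq, c = 4.0, 1.0, 0.3
modes = normal_modes(w1_sq, w2_sq, c)
(wp_sq, a1, b1), (wpp_sq, a2, b2) = modes
xx, yy, xy = potential_coefficients(w1_sq, w2_sq, c, modes)

print("orthogonality sum:", a1 * a2 + b1 * b2)
print("X^2 coefficient and omega'^2:", xx, wp_sq)
print("Y^2 coefficient and omega''^2:", yy, wpp_sq)
print("XY coefficient:", xy)
```

The signs of the normalized coefficients are a matter of convention; reversing the sign of either pair leaves every printed quantity unchanged.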
The case of a quadratic potential is one of the few sorts of mechanical problems in which a general solution is possible, for no such theorem holds with other laws of force. The equations of motion for the normal coordinates are just like those for harmonic oscillators, so that their solutions are sinusoidal vibrations. In general, there are then as many fundamental periods in the motion as there are constants, so that the motion is multiply periodic.

The normal coordinates are of particular value when we come to discuss the action of external forces on the coupled systems. For suppose there are external forces $F_1$ and $F_2$ acting on the two particles respectively, in addition to the elastic forces already considered. Then we can set up the generalized forces acting on the two normal coordinates, by the method described in Chap. VIII. If these are $F_X$ and $F_Y$, we have

$$F_X = F_1\frac{\partial x_1}{\partial X} + F_2\frac{\partial x_2}{\partial X} = \alpha' F_1 + \beta' F_2, \qquad F_Y = F_1\frac{\partial x_1}{\partial Y} + F_2\frac{\partial x_2}{\partial Y} = \alpha'' F_1 + \beta'' F_2.$$

Then the equations of motion are simply

$$\frac{d^2X}{dt^2} + \omega'^2 X = F_X, \qquad \frac{d^2Y}{dt^2} + \omega''^2 Y = F_Y, \tag{12}$$

showing that these normal coordinates have the same sort of equations of motion, under the action of external forces, as single oscillators. Thus the complete solution will be a sum of a particular solution of the inhomogeneous equations, consisting of vibrations of the same nature as the external force, capable, therefore, of showing resonance phenomena, and of a general solution of the homogeneous equations, of the sort we have found.

Fig. 18. Rotation of coordinates. The distances OA and PA are the $x$ and $y$ coordinates of the point P, and OB and PB are the $X$ and $Y$ coordinates.

Under certain circumstances, a damping force proportional to the velocity will also be expressed in terms of normal coordinates as a constant times the time rate of change of the normal coordinate, but this is not always true. We shall discuss this question in Chapter XIII.

72. Relation of Problem of Coupled Systems to Two-dimensional Oscillator.
Our problem of two coupled one-dimensional oscillators reminds us strongly of the case of two-dimensional oscillators encountered in Chap. IX. Here, as there, we have two coordinates ($x_1$ and $x_2$ here, $x$ and $y$ there), and linear restoring forces. But the difference is that here the restoring force acting on each coordinate depends on the values of both. The corresponding problem in the two-dimensional case would be that where $F_x = -ax + cy$, $F_y = -by + cx$, where $a = \omega_1^2$, $b = \omega_2^2$. And obviously the problem can be solved just as we have treated our case of the coupled oscillators. That is, we introduce new variables $X$, $Y$, defined by the equations $x = \alpha' X + \alpha'' Y$, $y = \beta' X + \beta'' Y$, where the $\alpha$'s and $\beta$'s have the values found above, and in terms of the new variables $X$ and $Y$ we have separation, and get a solution in which $X$ and $Y$ execute periodic vibrations of different frequencies. But now we can get a very simple geometrical interpretation of our change of variables: it is merely a rotation of coordinates.

To see this, let us first consider what a rotation of coordinates means analytically. In Fig. 18, we see old coordinates $xy$, and new, rotated ones, $XY$. The $xy$ and $XY$ coordinates of a point P are indicated. Now there is a very simple vector way of writing the coordinates. Let $\mathbf{i}$, $\mathbf{j}$ be the unit vectors along $x$ and $y$, respectively, and $\mathbf{I}$, $\mathbf{J}$ along $X$, $Y$. Further, let $\mathbf{r}$ be the radius vector from the origin to point P. Then evidently we have $x = (\mathbf{i}\cdot\mathbf{r})$, $y = (\mathbf{j}\cdot\mathbf{r})$, $X = (\mathbf{I}\cdot\mathbf{r})$, $Y = (\mathbf{J}\cdot\mathbf{r})$. But we can express $\mathbf{i}$ and $\mathbf{j}$ in terms of $\mathbf{I}$ and $\mathbf{J}$, or vice versa:

$$\mathbf{i} = (\mathbf{i}\cdot\mathbf{I})\mathbf{I} + (\mathbf{i}\cdot\mathbf{J})\mathbf{J}, \qquad \mathbf{j} = (\mathbf{j}\cdot\mathbf{I})\mathbf{I} + (\mathbf{j}\cdot\mathbf{J})\mathbf{J}.$$

Hence we have

$$x = (\mathbf{i}\cdot\mathbf{r}) = (\mathbf{i}\cdot\mathbf{I})(\mathbf{I}\cdot\mathbf{r}) + (\mathbf{i}\cdot\mathbf{J})(\mathbf{J}\cdot\mathbf{r}) = (\mathbf{i}\cdot\mathbf{I})X + (\mathbf{i}\cdot\mathbf{J})Y, \tag{13}$$

and $y = (\mathbf{j}\cdot\mathbf{I})X + (\mathbf{j}\cdot\mathbf{J})Y$. These are linear equations of just the sort already found and agree if

$$(\mathbf{i}\cdot\mathbf{I}) = \alpha', \quad (\mathbf{i}\cdot\mathbf{J}) = \alpha'', \quad (\mathbf{j}\cdot\mathbf{I}) = \beta', \quad (\mathbf{j}\cdot\mathbf{J}) = \beta''. \tag{14}$$

We may not assume, however, that any linear transformation of this sort corresponds to a rotation; the general transformation would be to a stretched, oblique set of axes. For the new coordinates to be obtained from the old by merely rotating, we must have two conditions: (1) the vectors $\mathbf{I}$ and $\mathbf{J}$ must be at right angles, or orthogonal, to each other; (2) $\mathbf{I}$ and $\mathbf{J}$ must be of unit length, or, as we say, normalized. That is, in vector notation, $(\mathbf{I}\cdot\mathbf{J}) = 0$, $I^2 = J^2 = 1$. Now we can express these equations by taking components along the $x$, $y$ axes: since $\mathbf{I} = (\mathbf{i}\cdot\mathbf{I})\mathbf{i} + (\mathbf{j}\cdot\mathbf{I})\mathbf{j}$ and $\mathbf{J} = (\mathbf{i}\cdot\mathbf{J})\mathbf{i} + (\mathbf{j}\cdot\mathbf{J})\mathbf{j}$, we have

$$(\mathbf{I}\cdot\mathbf{J}) = 0 = (\mathbf{i}\cdot\mathbf{I})(\mathbf{i}\cdot\mathbf{J}) + (\mathbf{j}\cdot\mathbf{I})(\mathbf{j}\cdot\mathbf{J}) = \alpha'\alpha'' + \beta'\beta'', \tag{15}$$

or the orthogonality conditions, which we have already seen to be satisfied, and whose significance we now see. Also

$$I^2 = 1 = (\mathbf{i}\cdot\mathbf{I})^2 + (\mathbf{j}\cdot\mathbf{I})^2 = \alpha'^2 + \beta'^2, \qquad J^2 = 1 = \alpha''^2 + \beta''^2, \tag{16}$$

or the normalization conditions, which we satisfied by proper choice of arbitrary constants. We can, in conclusion, make the following statement: any linear transformation in which the transformation coefficients satisfy the orthogonality and normalization conditions corresponds to a rotation of coordinates.

The advantage of making our rotation is seen when we consider the mechanical problem. In the original problem we have force components $F_x = -ax + cy$, $F_y = -by + cx$. We can find the components of force in the new variables. Evidently $F_x = (\mathbf{F}\cdot\mathbf{i})$, $F_y = (\mathbf{F}\cdot\mathbf{j})$, and similarly

$$F_X = (\mathbf{F}\cdot\mathbf{I}) = (\mathbf{F}\cdot\mathbf{i})(\mathbf{i}\cdot\mathbf{I}) + (\mathbf{F}\cdot\mathbf{j})(\mathbf{j}\cdot\mathbf{I}) = \alpha' F_x + \beta' F_y$$
$$= \alpha'(-ax + cy) + \beta'(-by + cx) = -(\alpha' a - \beta' c)x - (-\alpha' c + \beta' b)y$$
$$= -(\alpha' a - \beta' c)(\alpha' X + \alpha'' Y) - (-\alpha' c + \beta' b)(\beta' X + \beta'' Y)$$
$$= -(\alpha'^2 a - 2\alpha'\beta' c + \beta'^2 b)X - (\alpha'\alpha'' a - \alpha''\beta' c - \alpha'\beta'' c + \beta'\beta'' b)Y.$$

But by results already proved, we easily see that the first parenthesis equals $\omega'^2$ (or a corresponding expression in terms of $a$ and $b$), and the second is zero, so that $F_X = -\omega'^2 X$, and similarly $F_Y$ turns out to be $-\omega''^2 Y$.
In other words, by this rotation of axes, we have got each component of force to depend on displacement in that direction alone. Incidentally, the method of finding the components of a vector in rotated coordinates which we have used is of general application.

The object of the rotation becomes even clearer when we consider the potential energy. This is the quantity whose $x$ derivative is $ax - cy$, and $y$ derivative is $by - cx$. First we note that $\partial F_x/\partial y = \partial F_y/\partial x = c$, so that the curl of the force is zero, and the potential exists. Then we easily see that $V = \tfrac{1}{2}(ax^2 + by^2 - 2cxy)$, or $\tfrac{1}{2}(\omega_1^2 x^2 + \omega_2^2 y^2 - 2cxy)$. An equipotential, obtained by setting this expression equal to a constant, is an ellipse with its center at the origin, but with its major and minor axes inclined at an angle to the $xy$ axes, unless $c = 0$. But now we have seen that the potential in the new coordinates has the expression $V = \tfrac{1}{2}(\omega'^2 X^2 + \omega''^2 Y^2)$. If this is equal to a constant, the result is the equation of an ellipse whose principal axes are along the $X$ and $Y$ axes. In other words, our whole change of variables has been a rotation of the coordinate axes to point along the principal axes of the elliptical equipotentials.

The process of rotating axes to coincide with the principal axes of an ellipse or ellipsoid is a common thing in mathematical physics. We have already seen one example in the last chapter, where we had the ellipsoid of inertia, and used the principal axes as coordinates. Other illustrations come from the theory of elasticity, where there is an ellipsoid of stress at each point, and we often use the principal axes of stress as coordinates. Again, in wave mechanics, examples of the same sort of process are constantly found.

73. The General Problem of the Motion of Several Particles.
The present problem is the first one we have met in which there are several particles interacting with each other, and it has illustrated one of the useful methods of attack on such a problem. This is to take all the coordinates, whether they refer to one or another particle, and imagine them all plotted in a many-dimensional space, like the phase space which we discussed in connection with the Hamiltonian method, but with only enough dimensions to take care of coordinates, not of momenta. Such a space is often called a configuration space. Then the motion of the system is given by the motion of a point in configuration space. If there is a potential, it is a function of position in configuration space. We can then apply many of the same ideas to the motion of the point in many-dimensional space that we would to the motion of a single particle in three-dimensional space. Thus there will be parts of configuration space where $E - V$ is positive; there the point can go, but it cannot enter the regions where $E - V$ is negative.

In some cases, changes of variables in configuration space can simplify the problem enough so that we can separate variables, or at least go far toward a solution. The present chapter has supplied one instance. Another is found in the problem of two particles, as the earth and sun, exerting forces on each other but not being acted on by outside bodies. There we can introduce new coordinates: first, the three coordinates of the center of gravity of the system; second, the coordinates of one particle relative to the other. And in terms of these new coordinates, the three coordinates of the center of gravity become separated from the others, resulting in a uniform motion of the center of gravity in a straight line, and the relative motion reduces to a problem mathematically equivalent to the motion of a single particle in three-dimensional space.
The changes of variables used in these cases generally have the property, which we have noted in the present case, of mixing up the coordinates of two or more particles in a single generalized coordinate.

Problems

1. Two balls, each of mass $m$, and three weightless springs, one of length $2d$, the others of length $d$, are connected together in the arrangement spring $d$, ball, spring $2d$, ball, spring $d$, and the whole thing is stretched in a straight line between two points, with a given tension in the springs. Gravity is neglected. Investigate the small vibrations of the balls at right angles to the straight line, assuming motion only in one plane. Show in general that there are two modes of vibration, one having the lower frequency, in which both balls oscillate to the same side at one time, then the other, and the second mode, with higher frequency, where they oscillate to opposite sides. (Hint: if the first is displaced $x_1$, and the second $x_2$, and if these displacements are so small that the tension $t$ is unchanged, then there will be two forces acting on the first ball: a force $t$ toward the point of support, making an angle whose tangent is $x_1/d$, and another directed toward the second ball, at an angle whose tangent is $(x_2 - x_1)/2d$. The component at right angles to the straight line, and thus producing the motion, is then $-x_1(t/d) + (x_2 - x_1)(t/2d)$. Similarly the force on the second is $-x_2(t/d) + (x_1 - x_2)(t/2d)$.)

2. Assume two resistanceless circuits, one with $L_1$, $C_1$, the other with $L_2$, $C_2$, coupled together by having a mutual inductance $M$ between the two inductances (that is, back e.m.f. of self- and mutual inductance is $-L_1\,di_1/dt - M\,di_2/dt$ in the first circuit, and $-L_2\,di_2/dt - M\,di_1/dt$ in the second circuit, where $i_1$, $i_2$ are the currents in the circuits). Find the frequencies of the natural oscillations of the coupled system.

3. In Prob. 2, assume that the circuits have small resistances $R_1$ and $R_2$, respectively, so small that the logarithmic decrements of the separate circuits are small. Discuss the damped oscillations, showing that the solution can be carried out if squares of resistances are small enough to be neglected, but that it leads to a biquadratic equation for the frequency for large $R$. (Hint: write the frequency as the sum of a real and an imaginary part.)

4. Two identical pendulums hang from a support which is slightly yielding, so that they can interchange energy. Assume that coupling is linear. Now suppose one pendulum is set into motion, the other being at rest. Show that gradually the first pendulum will come to rest, the second taking up the motion, and that there is a periodic pulsation of the energy from one pendulum to the other. Show that the frequency of this pulsation gets smaller as the coupling becomes smaller, until with an infinitely rigid support the energy remains always in the first pendulum (this is all without damping forces).

5. One simple pendulum is hung from another; that is, the string of the lower pendulum is tied to the bob of the upper one. Discuss the small oscillations of the resulting system, assuming arbitrary lengths and masses. Use the angles which each string makes with the vertical as generalized coordinates. In the special case of equal masses and equal lengths of strings, show that the frequencies of the motion are given by $\sqrt{g(2 \pm \sqrt{2})/l}$.

6. Show that if the mass of the upper pendulum becomes very great compared with the lower one, the solution of Prob. 5 approaches that of Prob. 8, Chap. IV. Show in the other limiting case, where the upper mass is small compared with the lower one, that the motion consists approximately of an oscillation of the large mass with a period derived from the combined length of both pendulums, and a more rapid oscillation of the small mass back and forth with respect to the line connecting point of support and large mass.

7. Given an ellipse $ax^2 + bxy + cy^2 = d$, perform a rotation of axes so that the new coordinates will lie along the major and minor axes of the ellipse. From this rotation, find the angle between the major axis and the $x$ axis, in terms of the coefficients $a$, $b$, $c$, $d$. It is simplest to write the transformation directly in terms of the angle $\theta$: $x' = x\cos\theta + y\sin\theta$, etc.

8. Show that if the equations

$$x' = a_{11}x + a_{12}y + a_{13}z, \qquad y' = a_{21}x + a_{22}y + a_{23}z, \qquad z' = a_{31}x + a_{32}y + a_{33}z$$

represent a rotation of coordinates, the $a$'s satisfy orthogonality and normalization relations, both of the form $a_{11}a_{12} + a_{21}a_{22} + a_{31}a_{32} = 0$, $a_{11}^2 + a_{21}^2 + a_{31}^2 = 1$, and of the form $a_{11}a_{21} + a_{12}a_{22} + a_{13}a_{23} = 0$, $a_{11}^2 + a_{12}^2 + a_{13}^2 = 1$.

9. In the rotation of coordinates above, show that the inverse transformation is given by

$$x = a_{11}x' + a_{21}y' + a_{31}z', \qquad y = a_{12}x' + a_{22}y' + a_{32}z', \qquad z = a_{13}x' + a_{23}y' + a_{33}z'.$$

Prove that the determinant of the $a$'s is equal to unity.

10. Find the components of an arbitrary vector in the rotated set of coordinates given in Prob. 8. Show that the components of grad $V$, where $V$ is a scalar, in the rotated axes, are $\partial V/\partial x'$, $\partial V/\partial y'$, $\partial V/\partial z'$; that is, that the gradient is invariant under a rotation of axes (has the same form in the new axes as in the old).

11. Prove that the divergence, curl, and Laplacian are invariant under a rotation.

12. Set up a method for getting the direction cosines of the principal axes of inertia of a body, and the values of the principal moments of inertia, if the moments and products of inertia are known in a particular coordinate system.
CHAPTER XII

THE VIBRATING STRING, AND FOURIER SERIES

In this chapter we turn to the discussion of the motion of a continuous medium. There are examples of such motion in one, two, or three dimensions; as a vibrating string in one dimension, a membrane in two, and an elastic solid, or gas, in three dimensions. We first consider the motion of a one-dimensional body, or string. Suppose we have a string of length $L$, mass $\mu$ per unit length (constant), with a tension $T$, set into transverse vibrations. From our elementary work, we know that an infinite number of modes of vibrations, or overtones, are possible. For the $n$th overtone, if it is present alone, the shape of the string at any time is given by $\sin(n\pi x/L)$, where $x$ is the coordinate of a point on the string measured from one end, and the function is proportional to the displacement transverse to the string. The frequency of this overtone is $\omega_n/2\pi$, where $\omega_n = (n\pi/L)\sqrt{T/\mu}$. Thus if $A_n$ is the complex amplitude of this overtone, and $u$ is the displacement of the point $x$, we have

$$u = \text{real part of } \sum_{n=1}^{\infty} A_n e^{i\omega_n t} \sin\frac{n\pi x}{L},$$

where we sum over all the possible overtones. Our first task is to derive these results from fundamental principles.

74. Differential Equation of the Vibrating String.

Assume that at a given time the string is displaced so that its shape is given by $u(x)$. We consider how this curve will change with time, and consider transverse displacements so small that the tension $T$ may be considered constant throughout the string. Take a short element of the string of length $dx$ and mass $\mu\,dx$. Its acceleration is $\partial^2 u/\partial t^2$ ($x$ kept constant), so that its mass times its acceleration is $\mu\,dx\,\partial^2 u/\partial t^2$. This must be equal to the force acting on this element which arises from the tensions. These tensions (which we take equal to each other in magnitude) would cancel each other exactly if the string were straight, but when it is curved, they each give rise to components approximately perpendicular to the string which vary with the curvature of the string (see Fig. 19). At any point $x$, this component is approximately $T\,\partial u/\partial x$, and we work only to the approximation to which this is true.

Fig. 19. Tensions on an element of string. Vertical component at $x$ is $-T\sin\theta$. If we approximate $\sin\theta$ by $\tan\theta$, this is $-T\,\partial u/\partial x$. Similarly at $x + dx$, the component is $+T\,\partial u/\partial x$, but now computed at $x + dx$.

Thus the total force on the element of string is

$$T\left(\frac{\partial u}{\partial x}\right)_{x+dx} - T\left(\frac{\partial u}{\partial x}\right)_{x} = T\frac{\partial^2 u}{\partial x^2}\,dx,$$

if we expand the first term in a Taylor's series and retain only the first two terms of the expansion. Thus our equation of motion is

$$\mu\,dx\,\frac{\partial^2 u}{\partial t^2} = T\frac{\partial^2 u}{\partial x^2}\,dx, \quad\text{or}\quad \frac{\partial^2 u}{\partial t^2} = \frac{T}{\mu}\frac{\partial^2 u}{\partial x^2}. \tag{1}$$

This is a partial differential equation, since it contains partial derivatives. This appearance of partial derivatives is characteristic of all equations of motion of continuous media. Since the equation is linear, with constant coefficients, let us try to solve it by the exponential method, assuming $u = e^{i(\omega t + kx)}$, as would be suggested by the solution in terms of overtones. The equation of motion leads immediately to $-\mu\omega^2/T = -k^2$, determining $\omega$ in terms of $k$. Combining two exponential solutions, allowable since the equation is linear and homogeneous, we have

$$u = Ae^{i\omega t}\sin kx, \quad\text{or}\quad u = Be^{i\omega t}\cos kx. \tag{2}$$

Now we must introduce the boundary conditions, which tell us that the string is held fixed at both ends, so that $u = 0$ when $x = 0$, and when $x = L$. From the first of these conditions $B = 0$, and we take only the sine function.
From the second, we must have $\sin kL = 0$, or $kL = n\pi$, $k = n\pi/L$, where $n = 1, 2, 3, \ldots$ Hence the solution is

$$u = A_n e^{i\omega_n t}\sin\frac{n\pi x}{L}, \tag{3}$$

where $\omega_n = (n\pi/L)\sqrt{T/\mu}$. Superposing the solutions of all the different $n$'s, as we may from the nature of the differential equation, we obtain the solution mentioned at the beginning of the chapter.

Our differential equation is a linear homogeneous partial differential equation of second order. As such, any linear combination of solutions is itself a solution. But now we have, not a small number of arbitrary constants, but the doubly infinite set $A_n$, $n = 1, 2, \ldots$ ($A_n$ is complex). This is characteristic of all partial differential equations. Sometimes instead of having an infinite set of arbitrary constants, we have an arbitrary function. In our case the $A$'s are determined by giving the amplitude and phase of each overtone. These must be determined from the initial conditions; that is, from the values of $u(x)$ and $\dot{u}(x)$ at $t = 0$. The essential point is that our partial differential equation is equivalent to an infinite number of ordinary differential equations, so that we need an infinite number of constants.

75. The Initial Conditions for the String.

Suppose we wish to satisfy initial conditions of the following sort for a vibrating string: at $t = 0$, the displacement and velocity are given functions of $x$. That is, if the displacement is $u(x, t)$, then $u(x, 0) = f(x)$, $\dfrac{\partial u}{\partial t}(x, 0) = F(x)$, where $f(x)$, $F(x)$ are arbitrary functions. Now we may write

$$u(x, t) = \sum_n (C_n\cos\omega_n t + D_n\sin\omega_n t)\sin\frac{n\pi x}{L}, \tag{4}$$

using the real form of the function of time, and

$$\dot{u}(x, t) = \sum_n (-C_n\omega_n\sin\omega_n t + D_n\omega_n\cos\omega_n t)\sin\frac{n\pi x}{L}.$$

Thus we must have

$$u(x, 0) = f(x) = \sum_n C_n\sin\frac{n\pi x}{L}, \qquad \dot{u}(x, 0) = F(x) = \sum_n D_n\omega_n\sin\frac{n\pi x}{L}. \tag{5}$$

To satisfy either of these conditions, we must be able to expand our arbitrary function in series of sines, and to find the coefficients $C_n$ or $D_n$ of these expansions. Having found the coefficients, we can at once set up the series for $u(x, t)$. This is a special case of Fourier expansion, and we now proceed to consider the general problem of Fourier series, a question of general interest apart from the application to a string.

76. Fourier Series.

We shall state Fourier's theorem. Given an arbitrary function $\phi(x)$. Then [unless $\phi(x)$ contains an infinite number of discontinuities in a finite range, or similarly misbehaves itself], we can write

$$\phi(x) = \frac{A_0}{2} + \sum_{n=1}^{\infty}\left(A_n\cos\frac{2n\pi x}{X} + B_n\sin\frac{2n\pi x}{X}\right),$$

where

$$A_n = \frac{2}{X}\int_{-X/2}^{X/2}\phi(x)\cos\frac{2n\pi x}{X}\,dx, \qquad B_n = \frac{2}{X}\int_{-X/2}^{X/2}\phi(x)\sin\frac{2n\pi x}{X}\,dx. \tag{6}$$

This equation holds for values of $x$ between $-X/2$ and $X/2$, but not in general outside this range. The series of sines and cosines is called Fourier's series. Obviously a special case of it could be used in our problem of the string, the case where only the coefficients of the sine terms were different from zero.

There are two sides to the proof of Fourier's theorem. First, we may prove that, if a series of sines and cosines of this sort can represent the function, then it must have the coefficients we have given. That is simple, and we shall carry it through. But, second, we could show that the series we so set up actually represents the function. That is, we should investigate the convergence of the series, show that it does converge and that its sum is the function $\phi(x)$.
This second part we shall omit, merely stating the results of the discussion.

77. Coefficients of Fourier Series.

Let us suppose that $\phi(x)$ is given by the series above, and ask what values of $A$'s and $B$'s we must have if the equation is to be true. Multiply both sides of the equation by $\cos(2m\pi x/X)$, where $m$ is an integer, and integrate from $-X/2$ to $X/2$. We have then

$$\int_{-X/2}^{X/2}\phi(x)\cos\frac{2m\pi x}{X}\,dx = \int_{-X/2}^{X/2}\left\{\frac{A_0}{2}\cos\frac{2m\pi x}{X} + \sum_{n=1}^{\infty}\left(A_n\cos\frac{2n\pi x}{X}\cos\frac{2m\pi x}{X} + B_n\sin\frac{2n\pi x}{X}\cos\frac{2m\pi x}{X}\right)\right\}dx.$$

But now we shall show in the next paragraph that

$$\int_{-X/2}^{X/2}\cos\frac{2n\pi x}{X}\cos\frac{2m\pi x}{X}\,dx = 0, \tag{7}$$

if $n$ and $m$ are integers, unless $n = m$, and that

$$\int_{-X/2}^{X/2}\sin\frac{2n\pi x}{X}\cos\frac{2m\pi x}{X}\,dx = 0,$$

if $n$ and $m$ are integers. Thus all terms on the right are zero but one, for which $n = m$. The first term falls in with the rule, when we remember that $\cos 0 = 1$. This one term then gives us

$$A_m\int_{-X/2}^{X/2}\cos^2\frac{2m\pi x}{X}\,dx = A_m\frac{X}{2},$$

as we can readily show. Hence

$$A_m = \frac{2}{X}\int_{-X/2}^{X/2}\phi(x)\cos\frac{2m\pi x}{X}\,dx.$$

In a similar way, multiplying by $\sin(2\pi m x/X)$, we can prove the formula for $B_m$.

In our derivation of coefficients, we have used the following results: $\int_{-X/2}^{X/2}\cos\dfrac{2n\pi x}{X}\cos\dfrac{2m\pi x}{X}\,dx = 0$, if $n$, $m$ are different integers, and similar relations with sines. We can prove these very easily from trigonometry. Thus

$$\cos a\cos b = \frac{\cos(a + b) + \cos(a - b)}{2},$$

so that our quantity is the integral of this, or

$$\frac{X}{2}\left[\frac{\sin\dfrac{2\pi(n + m)x}{X}}{2\pi(n + m)} + \frac{\sin\dfrac{2\pi(n - m)x}{X}}{2\pi(n - m)}\right]_{-X/2}^{X/2}.$$

But the quantity in brackets is zero at both limits, if $n$, $m$ are integers, and the result is zero. Such proofs hold in the other cases. The exception, of course, is the case $n = m$, in which the integrand is $\tfrac{1}{2}[\cos(4\pi m x/X) + 1]$, so that, while the first term gives no contribution to the result, the second gives $\dfrac{1}{2}\displaystyle\int_{-X/2}^{X/2}dx = \dfrac{X}{2}$.

78. Convergence of Fourier Series.
In this section we shall merely quote results. In the first place, the series cannot in general represent the function, except in the region between $-X/2$ and $X/2$. For the series is periodic, repeating itself with period $X$, while the function in general is not. Only periodic functions of this period can be represented in all their range by Fourier series. If we try to represent a nonperiodic function, the representation will be correct within the range from $-X/2$ to $X/2$, but the same thing will automatically repeat outside the range. Incidentally, we can easily change the range in which the function is correct. If we merely change the range of integration so as to be from $x_0$ to $x_0 + X$, where $x_0$ is arbitrary, the series will represent the function within this range. The case we have used above corresponds to $x_0 = -X/2$; another choice frequently made is $x_0 = 0$. Then again, if we change the value of $X$, we can change the length of the range in which the series is correct. To represent a function through a large range of $x$, we may use a large value of $X$.

Although the range within which a Fourier series converges to the value of the function it is supposed to represent is limited, as we have seen, there is a compensation, in that within this range a Fourier series can be used to represent much worse curves than a power series. Thus the convergence of the series is not impaired if the function has a finite number of discontinuities. It can consist, for example, of one function in one part of the region, another in another (in this case, to carry out the integrations, we must break up the integral into separate integrals over these parts, and add them). The less serious the discontinuities, however, the better the convergence. Thus if the function itself has discontinuities, the coefficients will go off as $1/n$, while if only the first derivative has discontinuities the coefficients go off as $1/n^2$, and so on.
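The two rates of falling off can be seen in a small numerical experiment. The Python sketch below is an illustration only: the helper `fourier_coefficients`, the two test functions, and the choice $X = 2$ are assumptions made for the example. It evaluates the integrals of Eq. (6) by the midpoint rule for a step function, which is itself discontinuous, and for $|x|$, whose first derivative alone is discontinuous. For odd $n$ the exact values are $B_n = 4/(n\pi)$ for the step and $A_n = -2X/(n^2\pi^2)$ for $|x|$, so that $nB_n$ and $n^2A_n$ should each stay constant.

```python
import math


def fourier_coefficients(phi, n, period, samples=20000):
    """Midpoint-rule estimate of (A_n, B_n) of Eq. (6) over [-X/2, X/2].

    Hypothetical helper for illustration; `samples` should be even so that
    no sample point falls exactly on x = 0.
    """
    h = period / samples
    a_sum = 0.0
    b_sum = 0.0
    for k in range(samples):
        x = -0.5 * period + (k + 0.5) * h
        arg = 2.0 * math.pi * n * x / period
        a_sum += phi(x) * math.cos(arg)
        b_sum += phi(x) * math.sin(arg)
    return 2.0 * a_sum * h / period, 2.0 * b_sum * h / period


def square(x):
    """Step function: discontinuous at x = 0."""
    return 1.0 if x > 0 else -1.0


def triangle(x):
    """|x|: continuous, but with a discontinuous first derivative."""
    return abs(x)


if __name__ == "__main__":
    X = 2.0  # illustrative period
    for n in (1, 3, 5, 7, 9):
        _, b_n = fourier_coefficients(square, n, X)
        a_n, _ = fourier_coefficients(triangle, n, X)
        print(n, "n*B_n =", n * b_n, " n^2*A_n =", n * n * a_n)
```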
Differentiating a function makes the convergence of a series worse, as we can see, for example, if a function is continuous but its first derivative discontinuous. Then the coefficients go off as $1/n^2$, but if we differentiate, the coefficients of the resulting series will go off as $1/n$.

There is an interesting point connected with the series for a discontinuous function. If the function jumps from one value $u_1$ to another $u_2$ at a given value of $x$, then the series at this point converges to the mean value, $(u_1 + u_2)/2$.

79. Sine and Cosine Series, with Application to the String.

In the special problem of the vibrating string, the series we require is somewhat different from the general case, in that there are only sines, and not cosines. We are therefore led to investigate series of sines only, or of cosines only. Suppose we take the series $\dfrac{A_0}{2} + \displaystyle\sum_{n=1}^{\infty}A_n\cos\frac{2n\pi x}{X}$, the series formed by taking the cosine part of the general Fourier series. Now each one of the terms is even in $x$; that is, if we interchange $x$ with $-x$, the function is not changed. A cosine series represents therefore an even function. Similarly the sine series $\displaystyle\sum_n B_n\sin\frac{2n\pi x}{X}$, of which each term is odd, represents an odd function (one for which, if $x$ is interchanged with $-x$, the function changes its sign but not its magnitude). It is well known that any function $\phi(x)$ can be written as the sum of an even and an odd function: $\phi(x) = \tfrac{1}{2}[\phi(x) + \phi(-x)] + \tfrac{1}{2}[\phi(x) - \phi(-x)]$, of which the first term is even, the second odd. Thus the cosine part of a Fourier series represents the even part of the function, the sine series the odd part. As a corollary, any even function can be represented by a cosine series alone, an odd function by a sine series.

Now suppose we are really interested in a function only between $0$ and $X/2$, and that we do not care what the series does outside that region.
Then we may define an even function $\phi_e(x)$ as follows: it equals the given function $\phi(x)$ between $0$ and $X/2$, but has just the same value for $-x$ that it has for $x$ (see Fig. 20). Outside the range from $-X/2$ to $X/2$, it repeats itself. The Fourier representation of $\phi_e$ will be a cosine series, but will represent our given function $\phi$ correctly between $0$ and $X/2$. Evidently it is the series $\dfrac{A_0}{2} + \displaystyle\sum_{n=1}^{\infty}A_n\cos\frac{2n\pi x}{X}$, where we write the coefficients as the sum of two integrals,

$$A_n = \frac{2}{X}\int_{-X/2}^{X/2}\phi_e(x)\cos\frac{2n\pi x}{X}\,dx = \frac{2}{X}\left\{\int_{-X/2}^{0}\phi(-x)\cos\frac{2n\pi x}{X}\,dx + \int_{0}^{X/2}\phi(x)\cos\frac{2n\pi x}{X}\,dx\right\} = \frac{4}{X}\int_{0}^{X/2}\phi(x)\cos\frac{2n\pi x}{X}\,dx.$$

Fig. 20. A function, with even and odd periodic functions made from it. The even and odd functions, $\phi_e(x)$ and $\phi_o(x)$, agree with the original function $\phi(x)$ between $0$ and $X/2$. Between $0$ and $-X/2$, $\phi_e(x)$ is the mirror image of $\phi(x)$, while $\phi_o(x)$ has the opposite sign. Outside the region from $-X/2$ to $X/2$, both functions repeat periodically with period $X$.

Similarly we may define an odd function $\phi_o(x)$, which equals $\phi(x)$ between $0$ and $X/2$, but at $-x$ has the negative of its value at $+x$. This function is represented by a sine series $\displaystyle\sum_{n=1}^{\infty}B_n\sin\frac{2n\pi x}{X}$, where we readily see that

$$B_n = \frac{4}{X}\int_{0}^{X/2}\phi(x)\sin\frac{2n\pi x}{X}\,dx.$$

Hence, between $0$ and $X/2$, the same function can be represented by either a cosine or a sine series. But outside this range, the series represent quite different functions.

Our sine series can now be applied to the string problem. We are interested in the string between $0$ and $L$. Let us then set $L = X/2$. The expression then becomes

$$\phi(x) = \sum_{n=1}^{\infty}B_n\sin\frac{n\pi x}{L}, \tag{8}$$

where

$$B_n = \frac{2}{L}\int_{0}^{L}\phi(x)\sin\frac{n\pi x}{L}\,dx.$$

This can be used first to find the coefficients $C_n$, from Eq. (5), substituting $u(x, 0)$ for $\phi(x)$, and next to find the quantities $D_n\omega_n$, substituting $\dot{u}(x, 0)$ for $\phi(x)$, $D_n\omega_n$ for $B_n$, and obtaining $D_n$ by dividing through by $\omega_n$.
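As an illustration of this procedure, the Python sketch below treats a string pulled aside at its middle point and released from rest. The initial shape, the numerical values of $L$, $T$, $\mu$, the number of terms kept, and the function names are all assumptions made for the example. The $C_n$ are found from Eq. (8) by the midpoint rule, the $D_n$ are zero because the initial velocity vanishes, and the series of Eq. (4) is then summed. For this shape the integral can also be done by hand, giving $C_n = (8h/n^2\pi^2)\sin(n\pi/2)$, which serves as a check.

```python
import math


def sine_coefficient(phi, n, length, samples=4000):
    """Midpoint-rule estimate of B_n = (2/L) * integral of phi(x) sin(n pi x / L)."""
    step = length / samples
    total = 0.0
    for k in range(samples):
        x = (k + 0.5) * step
        total += phi(x) * math.sin(n * math.pi * x / length)
    return 2.0 * total * step / length


def string_displacement(x, t, c_coeffs, d_coeffs, length, tension, mu):
    """Partial sum of Eq. (4); list index 0 holds the n = 1 coefficients."""
    u = 0.0
    for n, (c_n, d_n) in enumerate(zip(c_coeffs, d_coeffs), start=1):
        w_n = (n * math.pi / length) * math.sqrt(tension / mu)
        time_part = c_n * math.cos(w_n * t) + d_n * math.sin(w_n * t)
        u += time_part * math.sin(n * math.pi * x / length)
    return u


# Illustrative values only (assumed for the example).
LENGTH, TENSION, MU, HEIGHT, TERMS = 1.0, 4.0, 1.0, 0.01, 50


def plucked(x):
    """Initial shape f(x): triangle of height HEIGHT at the midpoint."""
    if x < 0.5 * LENGTH:
        return 2.0 * HEIGHT * x / LENGTH
    return 2.0 * HEIGHT * (LENGTH - x) / LENGTH


C = [sine_coefficient(plucked, n, LENGTH) for n in range(1, TERMS + 1)]
D = [0.0] * TERMS  # released from rest, so F(x) = 0 and every D_n vanishes

if __name__ == "__main__":
    w_1 = (math.pi / LENGTH) * math.sqrt(TENSION / MU)
    print("C_1 computed:", C[0], " by hand:", 8.0 * HEIGHT / math.pi ** 2)
    print("u(L/2, 0)      =", string_displacement(0.5 * LENGTH, 0.0, C, D, LENGTH, TENSION, MU))
    print("u(L/2, pi/w_1) =", string_displacement(0.5 * LENGTH, math.pi / w_1, C, D, LENGTH, TENSION, MU))
```

Half a period of the fundamental after release, every odd overtone has reversed its sign, so the middle of the string should be found at very nearly $-h$.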
The formulas of Eq. (8) then suffice to find the constants $C_n$ and $D_n$ of the motion of the string, knowing the initial displacement and velocity of every point of it.

80. The String as a Limiting Problem of Vibration of Particles.

An excellent insight into the problem of the vibration of a string is obtained by regarding it as a limiting case of mechanical systems with a finite number of particles, having therefore a finite set of arbitrary constants in the solution. This is the method followed by Lagrange. Suppose we have $N$ equal masses $m$ at the points $x = d/2, 3d/2, \ldots, (N - \tfrac{1}{2})d$, separated by massless springs, the whole being stretched with a tension $T$ between supports at $x = 0$ and $x = Nd = L$. This forms an approximation to the continuous string, if $\mu = m/d$, the mass per unit length. We again investigate the transverse vibrations, letting the displacement of the $i$th particle be $u_i$. The problem is similar to Problem 1, Chap. XI. The force on the $i$th particle is

$$F_i = -(u_i - u_{i-1})\frac{T}{d} - (u_i - u_{i+1})\frac{T}{d},$$

except for the first particle, which is only $d/2$ from its support, so that we have

$$F_1 = -u_1\frac{2T}{d} - (u_1 - u_2)\frac{T}{d},$$

and for the $N$th,

$$F_N = -(u_N - u_{N-1})\frac{T}{d} - u_N\frac{2T}{d}.$$

Then, assuming a solution $u_j = C_j e^{i\omega t}$, we have the $N$ equations of motion in the form

$$\left(-m\omega^2 + \frac{3T}{d}\right)C_1 - \frac{T}{d}C_2 = 0,$$
$$-\frac{T}{d}C_1 + \left(-m\omega^2 + \frac{2T}{d}\right)C_2 - \frac{T}{d}C_3 = 0,$$
$$-\frac{T}{d}C_2 + \left(-m\omega^2 + \frac{2T}{d}\right)C_3 - \frac{T}{d}C_4 = 0,$$
$$\cdots$$
$$-\frac{T}{d}C_{N-1} + \left(-m\omega^2 + \frac{3T}{d}\right)C_N = 0. \tag{9}$$

Such a set of equations, all alike in form (except here for the first and last), are called difference equations. As in the last chapter, these have a solution only for certain values of $\omega$, given by setting the determinant of coefficients equal to zero. The determinant is now too complicated to handle simply, wherefore we adopt another method of procedure. Suppose we let $C_j = e^{ikj}$, where $k$ is to be determined from the equations above.
All the equations except the first and last take the form

$$-\frac{T}{d}e^{ik(j-1)} + \left(-m\omega^2 + \frac{2T}{d}\right)e^{ikj} - \frac{T}{d}e^{ik(j+1)} = 0,$$

or, dividing through by $e^{ikj}$,

$$-\frac{2T}{d}\cos k + \left(-m\omega^2 + \frac{2T}{d}\right) = 0,$$

whence

$$m\omega^2 = \frac{2T}{d}(1 - \cos k). \tag{10}$$

That is, for any $\omega$, we can choose a value of $k$ by this relation, so that all the equations except the first and last are satisfied. These fall into line as well if we set up $C_0 = -C_1$ and $C_{N+1} = -C_N$, so that if these conditions are satisfied we have $e^{ikj}$, or equally well $e^{-ikj}$, or $\sin kj$ or $\cos kj$, as solutions of the equations for $C_j$. These conditions on $C_0$ and $C_{N+1}$ are essentially boundary conditions, one at each end of the string, and we readily see that they are satisfied if we make our function zero for $x = 0$, $x = L$, as we do if it is $\sin\frac{n\pi x}{L}$, where $n$ is an integer. That is, since $x$ is $(j - \tfrac{1}{2})d$ for the $j$th particle, we have

$$C_{jn} = \sin\left[\frac{n\pi}{N}\left(j - \frac{1}{2}\right)\right], \tag{11}$$

so that $k = n\pi/N$. We see from this form of $C_{jn}$ that $C_{0n} = -C_{1n}$, and that only those values of $n$ up to $N$ give us different sets of $C$'s. If $n$ is greater than $N$, then for each integral $j$ we get just the same value of $C_{jn}$ that we had for a certain $n$ less than $N$, so that the whole scheme repeats itself over and over as $n$ increases, and we really have only $N$ distinct solutions. Similarly in the expression for the frequency, the term $1 - \cos k = 1 - \cos(n\pi/N)$ is periodic, so that as soon as $n$ becomes greater than $N$ we repeat the frequencies already found. There are, then, just $N$ solutions, each with its frequency and its complex amplitude for each particle. This fits in with the single frequency for one particle and the two which we have found for two coupled particles. For each of the $N$ particles there is an arbitrary amplitude and phase, or arbitrary complex amplitude, so that there are just $2N$ arbitrary constants. The whole solution is the sum, as $n$ goes from 1 to $N$, of the real parts of $A_n e^{i\omega_n t}\sin\frac{n\pi x}{L}$, or

$$u = \sum_{n=1}^{N} B_n \sin\frac{n\pi x}{L}\cos(\omega_n t - \epsilon_n). \tag{12}$$

Each one of these terms represents the amplitudes of all the particles when vibrating with a particular mode of motion, analogous to an overtone of the string. To get the amplitude of the $j$th particle, we set $x = (j - \tfrac{1}{2})d$. The angular velocity $\omega_n$ of the $n$th overtone is given by

$$m\omega_n^2 = \frac{2T}{d}\left(1 - \cos\frac{n\pi}{N}\right). \tag{13}$$

81. Lagrange's Equations for the Weighted String.
The equations of motion which we have discussed above may also be obtained readily from Lagrange's method, and we shall set up expressions for the kinetic and potential energies. For the kinetic energy $T_1$, we have simply

$$T_1 = \frac{m}{2}\left(\dot u_1^2 + \dot u_2^2 + \cdots + \dot u_N^2\right),$$

and for the potential energy

$$V = \frac{T}{2d}\left[2u_1^2 + (u_2 - u_1)^2 + (u_3 - u_2)^2 + \cdots + 2u_N^2\right],$$

and the Lagrangian equations

$$\frac{d}{dt}\left[\frac{\partial (T_1 - V)}{\partial \dot u_j}\right] - \frac{\partial (T_1 - V)}{\partial u_j} = 0$$

lead to the equations already used.

82. Continuous String as Limiting Case.
The solution we have found for the set of particles differs in two ways from the solution for the continuous string. First, there is only a finite set of overtones, and secondly, the frequencies are determined by different formulas. Both these differences disappear when the number of particles in the fixed length $L$ becomes infinite. To determine the limiting form of the expressions for the frequency, we develop $\cos\frac{n\pi}{N}$ in a power series for large $N$. We thus obtain

$$\cos\frac{n\pi}{N} = 1 - \frac{1}{2}\left(\frac{n\pi}{N}\right)^2 + \cdots,$$

so that $\omega_n$ becomes

$$\omega_n = \frac{n\pi}{N}\sqrt{\frac{T}{md}} = \frac{n\pi}{L}\sqrt{\frac{T}{\mu}},$$

using $Nd = L$ and $m = \mu d$. This agrees with our former result. In this limiting case of infinite $N$ (and infinitesimal $d$) the expressions for the kinetic and potential energies become

$$T_1 = \frac{\mu}{2}\int_0^L \dot u^2\,dx, \qquad V = \frac{T}{2}\int_0^L \left(\frac{\partial u}{\partial x}\right)^2 dx, \tag{14}$$

which may also be derived directly for the case of a continuous string.

Problems

1. Taking the case of four particles on a string, derive their displacements in the four possible normal vibrations, and compute their frequencies.
Compare these frequencies with the first four frequencies of the corresponding continuous string. Put in $n = N + 1$, and show how the solution reduces to one already found.

2. An actual string is composed of atoms, rather than being continuous, so that it has only a finite number of possible overtones. Assume that it consists of a single string of atoms, spaced $10^{-8}$ cm apart. Let the string be 1 m long, and at such tension that its fundamental is 100 cycles per second. Find the frequency of the highest possible harmonic, and show that it is in the infra-red region of the spectrum. Show that in this highest harmonic, successive atoms vibrate in opposite phases. Substances actually have such natural frequencies in the infra-red, and they are important in connection with their specific heat.

3. Prove that $u = \sin\omega[t - (x/v)]$ is a solution of the partial differential equation for the vibrating string, if $v$ is chosen properly, although it does not satisfy the boundary condition that the string be held at the ends. Consider the physical meaning of this solution, and show that it represents a wave traveling down the string with velocity $v$.

4. Superpose the wave of Prob. 3, traveling along the $+x$ axis, and a similar one traveling in the opposite direction, and show that the sum represents a standing wave of the type discussed in this chapter.

5. Find the wave length of the waves in the string, in the solution we have found in this chapter, and verify the relation $v = n\lambda$ between wave length $\lambda$, frequency $n$, and velocity $v$.

6. Proceeding as in Prob. 5, find the velocity of a wave along the weighted string, showing that it varies with frequency. Find a formula for the variation.

Fig. 21. Artificial electric line.

7. An artificial electric line can be constructed according to Fig. 21, consisting of $N$ identical resistanceless circuits, each containing inductance $L$, capacitance $C$, and coupled to each of its neighbors with mutual inductance $M$.
Set up the differential equations for the currents in the various circuits, showing that they reduce to the same form as with the weighted string.

8. Neglecting boundary conditions at the two ends of the line in Prob. 7, show that a disturbance can be propagated along the line with a definite velocity, as in Prob. 6.

9. A string of length $L$ is pulled aside at a point a distance $D$ from the end, and then released. Thus its initial shape is given by a curve made of two straight lines, and its initial velocity is zero. Find the solution for its motion, and find the amplitude of the $n$th harmonic.

10. Taking the solution of Prob. 9, for the special case where $D = L/2$, compute the first five terms of the Fourier series, when $t = 0$. Add them and plot the sum, showing how good an approximation they make to the correct curve.

11. A string initially at rest is struck at a distance $D$ from the end, at $t = 0$. Find the intensity in each overtone. Approximate the initial conditions as follows: the initial displacement is zero, and the initial velocity is a constant in a small region of length $d$ about the point $D$, zero elsewhere.

CHAPTER XIII
NORMAL COORDINATES AND THE VIBRATING STRING

In the preceding chapter, we have worked out the elementary theory of the vibrating string, finding the nature of the possible vibrations and the method of getting the amplitude of the overtones in terms of the initial conditions. When we begin to ask about slightly more complicated problems, however, we find that it is necessary to go further into the theory. For example, we might be interested in the nature of the forced vibrations under the action of an external sinusoidal force, or the effect of damping on the oscillations. Such questions are easily answered by introducing normal coordinates, much as we did with the two coupled oscillators.
These are generalized coordinates, which prove to be closely connected with the various overtones, so that if just one normal coordinate is vibrating, that means that the string is vibrating with the corresponding pure overtone. When we write Lagrange's equations in terms of the normal coordinates, we find that we can introduce external forces easily, and solve such problems.

At the same time, the general theory of normal coordinates for vibrating strings, which we shall get into, has particularly interesting relations to many other branches of mathematical physics. We shall gain much more insight into Fourier expansion, finding a general theory of expansion of which this is a special case, but which, as we shall find later, includes expansions in Bessel's functions, spherical harmonics, and many other sorts of functions. Such problems are met not only in vibrations, but also in heat flow (for which Fourier series were originally developed), potential theory, hydrodynamics, and in the newest branch of mathematical physics, wave mechanics, or the quantum theory, used in studying atomic structure.

83. Normal Coordinates.
In Chap. XI, we investigated the vibrations of two coupled particles and set up normal coordinates to describe the motion. Since we must make a considerable extension of the idea of normal coordinates in the present chapter, it will be best to review the results we have already found. We started with two coordinates, $x_1$ and $x_2$, describing the displacement of the two particles. The normal coordinates $X$ and $Y$ were introduced by a linear transformation

$$x_1 = \alpha' X + \alpha'' Y, \qquad x_2 = \beta' X + \beta'' Y,$$

which proved to be merely a rotation of axes in the $x_1$-$x_2$ space, so that $X$ and $Y$ were new orthogonal axes in that space.
To express the fact that the transformation was just a rotation, we had certain conditions holding between the coefficients: orthogonality conditions, as $\alpha'\alpha'' + \beta'\beta'' = 0$, and normalization conditions, as $\alpha'^2 + \beta'^2 = 1$. We saw that the quantities $\alpha'$, $\alpha''$, $\beta'$, $\beta''$ had a geometrical meaning: $\alpha'$ was $(\mathbf{i}\cdot\mathbf{I})$, and similar relations for the other quantities, showing that $\alpha'$, $\beta'$ were the components along the $x_1$ and $x_2$ axes, respectively, of unit vector along $X$, and $\alpha''$, $\beta''$ similarly were components of unit vector along $Y$. The object of the rotation to normal coordinates was to separate the variables of the equation of motion, so that each normal coordinate executed a vibration of its own, as $X = Ce^{i\omega' t}$, $Y = De^{i\omega'' t}$. This was equivalent to rotation so that the new axes in the $x_1 x_2$ space lay along the principal axes of the elliptical equipotentials of the problem.

We can now follow exactly the same model in our problem of the string. We start with the case of $N$ weights separated by springs. By analogy, the displacement of the first weight should be a linear combination of normal coordinates, the coefficients (corresponding to the $\alpha$'s and $\beta$'s) being the displacements of the first weight when only one overtone is excited, or $\sin\frac{n\pi x_1}{L}$ for the $n$th overtone. The displacements of these weights are taken to be $u_1 \ldots u_N$. Then we set up $N$ normal coordinates, $\phi_1 \ldots \phi_N$, by the equations

$$u_1 = \sum_{n=1}^{N} a_n \sin\frac{n\pi x_1}{L}\,\phi_n, \qquad u_2 = \sum_{n=1}^{N} a_n \sin\frac{n\pi x_2}{L}\,\phi_n, \quad \text{etc.,} \tag{1}$$

where $x_j = (j - \tfrac{1}{2})d$, and the numbers $a_n$ are determined by a condition soon to be described. Here the quantities $a_n \sin(n\pi x_j/L)$ correspond to the $\alpha$'s and $\beta$'s of the preceding chapter. But not only that: the coefficients still satisfy orthogonality and normalization conditions. The orthogonality conditions will be of the form

$$a_n a_m\left(\sin\frac{n\pi x_1}{L}\sin\frac{m\pi x_1}{L} + \sin\frac{n\pi x_2}{L}\sin\frac{m\pi x_2}{L} + \cdots + \sin\frac{n\pi x_N}{L}\sin\frac{m\pi x_N}{L}\right) = 0, \tag{2}$$

where $n$, $m$ are any two different indices. This is true, as can be shown by trigonometrical manipulation, though we shall not stop to do it. Similarly the normalization will be

$$a_n^2\left(\sin^2\frac{n\pi x_1}{L} + \sin^2\frac{n\pi x_2}{L} + \cdots + \sin^2\frac{n\pi x_N}{L}\right) = 1. \tag{3}$$

We can satisfy this by proper choice of $a_n$, since the parenthesis is a definitely determined, positive quantity. This is then the condition, called the normalization condition, for determining the constants $a_n$. Since we have as before an orthogonal transformation, we can again get a geometrical interpretation. We imagine an $N$-dimensional space, in which the quantities $u_1 \ldots u_N$ are plotted as coordinates. Now our transformation of axes is equivalent to a rotation of coordinates in this $N$-dimensional space. The normal coordinates $\phi_1 \ldots \phi_N$ represent new orthogonal axes in the space, in the sense that if $\phi_1 = 1$ and all the other $\phi$'s are zero, the corresponding point is displaced from the origin unit distance along the $\phi_1$ axis. The quantities like $a_n \sin\frac{n\pi x_j}{L}$ represent the components in the direction of the old axes of unit vectors along the new axes. Thus the one written is the cosine of the angle between the $\phi_n$ and the $u_j$ axes. The equations of motion are separated in the new coordinates, the solutions being $\phi_n = \text{constant} \times e^{i\omega_n t}$. Finally, the equipotentials, which are ellipsoids in the $N$-dimensional space, have principal axes in the directions which we have chosen for the normal coordinates. Thus the analogy with the two-dimensional problem is complete. The statements we have made without proof here are not very difficult to demonstrate, and some of them are taken up in problems.

We can now go one step farther, to the continuous string. Here the displacement of a point of the string is given by $u(x)$, where $x$ measures the coordinate of the point, corresponding to the $u_j$ for the problem of discrete weights.
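Returning for a moment to the weighted string, the orthogonality and normalization conditions (2) and (3), stated above without proof, can at least be checked numerically. The sketch below does this for a hypothetical case of six weights, and also checks that each mode shape satisfies the difference equations (9) of Chap. XII with the frequency of Eq. (13). The value $N = 6$, the units $T/d = 1$ and $m = 1$, and the helper names are assumptions made only for illustration.

```python
import math

def mode_shape(n, N):
    """Unnormalized amplitudes sin(n*pi*x_j/L) at x_j = (j - 1/2) d, with L = N d."""
    return [math.sin(n * math.pi * (j - 0.5) / N) for j in range(1, N + 1)]

def dot(a, b):
    """Ordinary scalar product of two N-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def normalization_constants(N):
    """a_n chosen so that a_n^2 * sum_j sin^2(n*pi*x_j/L) = 1, as in Eq. (3)."""
    return [1.0 / math.sqrt(dot(mode_shape(n, N), mode_shape(n, N)))
            for n in range(1, N + 1)]

def residual(n, N, T_over_d=1.0):
    """Largest residual of the difference equations, Eq. (9) of Chap. XII,
    for mode n, with m*omega^2 taken from Eq. (13) of Chap. XII."""
    C = mode_shape(n, N)
    m_omega2 = 2.0 * T_over_d * (1.0 - math.cos(n * math.pi / N))
    worst = 0.0
    for j in range(N):
        # the first and last weights are tied to the supports by half-length segments
        diag = 3.0 * T_over_d if j in (0, N - 1) else 2.0 * T_over_d
        r = (diag - m_omega2) * C[j]
        if j > 0:
            r -= T_over_d * C[j - 1]
        if j < N - 1:
            r -= T_over_d * C[j + 1]
        worst = max(worst, abs(r))
    return worst

N = 6  # hypothetical number of weights
a = normalization_constants(N)
largest_overlap = max(abs(dot(mode_shape(n, N), mode_shape(m, N)))
                      for n in range(1, N + 1)
                      for m in range(1, N + 1) if n != m)
print("largest scalar product between different modes:", largest_overlap)
print("normalization constants a_n:", a)
print("largest residual in Eq. (9):", max(residual(n, N) for n in range(1, N + 1)))
```

In this case the constants come out as $a_n = \sqrt{2/N}$ for $n$ less than $N$, while the highest mode, in which successive weights move oppositely, gives $a_N = \sqrt{1/N}$.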
We introduce normal coordinates $\phi_1, \ldots, \phi_n, \ldots$ (an infinite set, as there are an infinite number of points on the string), by the equation

$$u(x) = \sum_{n} a_n \sin\frac{n\pi x}{L}\,\phi_n. \tag{4}$$

The orthogonality conditions for the coefficients $a_n \sin\frac{n\pi x}{L}$ must now be written in terms of integrals, rather than sums; for we have terms for each value of $x$, from 0 to $L$, differing by infinitesimal amounts. Thus these conditions are

$$\int_0^L a_n a_m \sin\frac{n\pi x}{L}\sin\frac{m\pi x}{L}\,dx = 0, \tag{5}$$

where $n$ and $m$ are different integers. This can be immediately proved by evaluating the integral. Similarly the normalization condition is

$$\int_0^L a_n^2 \sin^2\frac{n\pi x}{L}\,dx = 1, \tag{6}$$

which as before serves to determine $a_n$.

84. Normal Coordinates and Function Space.
We must now imagine a space, not of $N$ dimensions, but of an infinite number. We cannot get an idea of what the coordinates mean, except by passing to the limit from the case of a finite number of mass points. With $N$ points, and $N$ dimensions, the first coordinate measures the displacement of the first mass point, and so on. Thus a point in the $N$-dimensional space determines all the coordinates, or in other words gives the displacements of all the masses. Now as $N$ gets larger and larger, and we have more and more dimensions, it still remains true that a particular coordinate measures the displacement at a particular part of the string. We see that this interpretation persists to the limit of infinitely many variables: each coordinate is connected with a point of the string, and its value gives the displacement at that point. But there is now an interesting side light on the situation. A point in our infinitely many-dimensional space gives complete information about the displacement of each point of the string. That is, it gives $u(x)$, a function of $x$.
Each point of this space is connected with a particular function, and each possible function is represented by a point of the space (of course, many points of the space refer to discontinuous functions and, therefore, are not suitable for describing a string). On account of this property, our space is often called a function space.

The normal coordinates now represent a set of rectangular axes in function space, rotated with respect to the original coordinates. Each normal coordinate refers to a particular mode of vibration, or overtone. If just one of the normal coordinates is excited, say if $\phi_n = 1$, all the other $\phi$'s being zero, the situation is represented by a certain point in function space; that is, by a certain function, giving the shape of the string. We can take the radius vector out to the point $\phi_n = 1$, all other $\phi$'s $= 0$, and project it on one of our original coordinates. Thus the projection on the coordinate connected with the point $x$ is $a_n \sin(n\pi x/L)$, showing that that is the displacement of this particular point of the string when this overtone alone is excited with unit amplitude. The expression $a_n \sin(n\pi x/L)$ is now a function; it is the function represented by a unit vector along the $\phi_n$ axis, in function space. Since the $\phi$ axes are orthogonal, we see that the scalar product of two such vectors along different axes is zero:

$$\int_0^L a_n a_m \sin\frac{n\pi x}{L}\sin\frac{m\pi x}{L}\,dx = 0,$$

where by analogy the scalar product takes this form, so that we have the orthogonality conditions and their geometrical meaning. Similarly the square of the unit vector, which is unity, is

$$\int_0^L a_n^2 \sin^2\frac{n\pi x}{L}\,dx = 1,$$

the normalization condition. This immediately gives $a_n^2 L/2 = 1$, $a_n = \sqrt{2/L}$.

Now as before, when we introduce the normal coordinates, we have rotated axes in function space to make the new coordinates lie along the principal axes of the ellipsoidal equipotentials.
And the equations of motion are separated, each normal coordinate vibrating with simple harmonic motion: $\phi_n = A_n e^{i\omega_n t}$. Finally, then, the motion is represented by

$$u(x) = \sum_{n=1}^{\infty}\sqrt{\frac{2}{L}}\,\sin\frac{n\pi x}{L}\,\phi_n = \sum_{n=1}^{\infty}\sqrt{\frac{2}{L}}\,A_n e^{i\omega_n t}\sin\frac{n\pi x}{L},$$

agreeing with the value found previously. In Section 86, we carry out the demonstration that the equations of motion are separated in the normal coordinates, and then we apply them to the discussion of forced motion.

85. Fourier Analysis in Function Space.
When we come to the question of satisfying initial conditions, and of Fourier's series, we meet immediately close connections with function space. Fourier's theorem, stated for sine series, can be put in the following form, by introducing terms $\sqrt{2/L}$:

$$f(x) = \sum_{n=1}^{\infty}\left(\sqrt{L/2}\,B_n\right)\sqrt{2/L}\,\sin\frac{n\pi x}{L}, \quad\text{where}\quad \sqrt{L/2}\,B_n = \int_0^L f(x)\sqrt{2/L}\,\sin\frac{n\pi x}{L}\,dx. \tag{7}$$

Now the functions $\sqrt{2/L}\sin(n\pi x/L)$ are the unit vectors in function space along the directions representing the overtones of the problem, functions which are often called the normal functions, or characteristic functions, of the problem. Thus Eq. (7) is just like a vector equation, stating that a vector $(f(x))$ is the sum of unit vectors $[\sqrt{2/L}\sin(n\pi x/L)]$ each multiplied by the component of the vector along the corresponding axis ($\sqrt{L/2}\,B_n$ is the component of $f(x)$ along the $n$th axis). To find these components, we need only project the vector $f(x)$ on the corresponding unit vector, which means taking the scalar product. But the scalar product, as we have seen, is an integral,

$$\left[f(x)\cdot\sqrt{2/L}\,\sin\frac{n\pi x}{L}\right] = \sqrt{L/2}\,B_n = \int_0^L f(x)\sqrt{2/L}\,\sin\frac{n\pi x}{L}\,dx. \tag{8}$$

Thus the formulas of Fourier's method have the simplest possible vector interpretation in function space. But we can also see that, if we had some other set of normalized orthogonal functions, we could proceed with an expansion in an analogous way.
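The vector picture can be illustrated with a short numerical sketch. Here the scalar product in function space is approximated by a midpoint-rule sum, the components of a trial function along the normal functions are found by projection as in Eq. (8), and the sum of the squares of the components is compared with the square of the length of the vector, $\int_0^L f^2\,dx$. The trial shape $f(x) = x(L - x)$, the length $L = 2$, and the number of axes retained are assumptions chosen only for illustration.

```python
import math

def normal_function(n, L):
    """Unit vector along the n-th axis: sqrt(2/L) sin(n pi x / L)."""
    return lambda x: math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)

def scalar_product(f, g, L, samples=4000):
    """Scalar product in function space, integral_0^L f(x) g(x) dx,
    approximated by the midpoint rule (a numerical choice made here)."""
    h = L / samples
    return h * sum(f((i + 0.5) * h) * g((i + 0.5) * h) for i in range(samples))

L = 2.0  # hypothetical string length

def f(x):
    """Hypothetical shape of the string, zero at both ends."""
    return x * (L - x)

# Components of f along the first fifteen axes, found by projection.
components = [scalar_product(f, normal_function(n, L), L) for n in range(1, 16)]
length_squared = scalar_product(f, f, L)

print("first three components:", components[:3])
print("sum of squared components:", sum(c * c for c in components))
print("square of the length of f:", length_squared)
```

The two printed totals differ only by the small contribution of the axes beyond the fifteenth, just as the square of the length of an ordinary vector is the sum of the squares of its components.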
It is worth noting that by using Fourier's method, we can solve for the normal coordinates in terms of $u(x)$: since $u(x) = \sum_{n=1}^{\infty}\sqrt{2/L}\,\sin\frac{n\pi x}{L}\,\phi_n$, it is obvious that $\phi_n = \int_0^L u(x)\sqrt{2/L}\,\sin\frac{n\pi x}{L}\,dx$, the component of $u(x)$ along the $n$th axis.

86. Equations of Motion in Normal Coordinates.
To find the equations of motion, we must set up the Lagrangian function. Let us first write for the velocity of the string

$$\dot u = \dot\phi_1\sqrt{2/L}\,\sin\frac{\pi x}{L} + \dot\phi_2\sqrt{2/L}\,\sin\frac{2\pi x}{L} + \cdots + \dot\phi_n\sqrt{2/L}\,\sin\frac{n\pi x}{L} + \cdots$$

and proceed to the expressions for the potential and kinetic energies. We have

$$T_1 = \frac{\mu}{2}\int_0^L \dot u^2\,dx = \frac{\mu}{2}\int_0^L\left(\sum_{n=1}^{\infty}\dot\phi_n\sqrt{2/L}\,\sin\frac{n\pi x}{L}\right)^2 dx = \frac{\mu}{2}\sum_{n=1}^{\infty}\dot\phi_n^2\,\frac{2}{L}\int_0^L \sin^2\frac{n\pi x}{L}\,dx,$$

since all the product terms disappear because of the orthogonal properties of the normal functions. Thus $T_1$ becomes reduced to a sum of squares in the generalized velocities and the integration over $x$ leads to the result

$$T_1 = \frac{\mu}{2}\sum_{n=1}^{\infty}\dot\phi_n^2. \tag{9}$$

In a similar manner we set up the expression for $V$, the potential energy. We have

$$V = \frac{T}{2}\int_0^L\left(\frac{\partial u}{\partial x}\right)^2 dx = \frac{T}{2}\int_0^L\left(\sum_{n=1}^{\infty}\frac{n\pi}{L}\,\phi_n\sqrt{2/L}\,\cos\frac{n\pi x}{L}\right)^2 dx,$$

which we treat exactly as in the case of the kinetic energy and obtain $V$ as a sum of squares of the generalized coordinates $\phi_n$, namely

$$V = \frac{T}{2}\sum_{n=1}^{\infty}\left(\frac{n\pi}{L}\right)^2\phi_n^2. \tag{10}$$

Using the Lagrangian equations of motion, we have $L_1 = T_1 - V$,

$$\frac{d}{dt}\left(\frac{\partial L_1}{\partial\dot\phi_n}\right) = \mu\ddot\phi_n, \qquad \frac{\partial L_1}{\partial\phi_n} = -\left(\frac{n\pi}{L}\right)^2 T\phi_n,$$

so that the equations of motion become

$$\mu\ddot\phi_n + \left(\frac{n\pi}{L}\right)^2 T\phi_n = \Phi_n, \qquad n = 1, 2, \cdots, \tag{11}$$

where $\Phi_n$ is the generalized external force corresponding to the coordinate $\phi_n$. Up to this point we have considered only free vibrations for which $\Phi_n$ is zero, but now we have generalized our problem to include such things as forced oscillations. We solve the equations above for the case of free vibrations, obtaining $\phi_n = A_n e^{i\omega_n t}$, with

$$\omega_n = \frac{n\pi}{L}\sqrt{\frac{T}{\mu}},$$

so that our expression for $u$ is just the one originally found, in Eq. (3), Chap. XII.
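The reduction of the potential energy to a sum of squares, Eq. (10), can be checked numerically for a shape built from a few normal functions. In the sketch below, $V$ is computed once by integrating $(\partial u/\partial x)^2$ directly and once from Eq. (10). The three values of $\phi_n$, the length, and the tension are hypothetical numbers chosen only for illustration, and the midpoint rule is a numerical choice made here.

```python
import math

def potential_energy_direct(phi, L, T, samples=4000):
    """V = (T/2) * integral_0^L (du/dx)^2 dx, with
    u = sum_n phi_n sqrt(2/L) sin(n pi x / L), by the midpoint rule."""
    h = L / samples
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) * h
        slope = sum(p * math.sqrt(2.0 / L) * (n * math.pi / L)
                    * math.cos(n * math.pi * x / L)
                    for n, p in enumerate(phi, start=1))
        total += slope * slope
    return 0.5 * T * total * h

def potential_energy_normal(phi, L, T):
    """Eq. (10): V = (T/2) * sum_n (n pi / L)^2 phi_n^2."""
    return 0.5 * T * sum((n * math.pi / L) ** 2 * p * p
                         for n, p in enumerate(phi, start=1))

phi = [0.3, -0.1, 0.05]  # hypothetical values of the first three normal coordinates
L, T = 1.5, 2.0          # hypothetical length and tension

V_direct = potential_energy_direct(phi, L, T)
V_normal = potential_energy_normal(phi, L, T)
print("V by direct integration:", V_direct)
print("V from Eq. (10):        ", V_normal)
```

The agreement of the two numbers reflects the disappearance of the product terms between different overtones; the same check could be made for the kinetic energy, Eq. (9).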
It is thus clear that we have essentially used normal coordinates in our first discussion of the vibrating string.

The generalized force $\Phi_n$ is defined so that the work done by the external force during a displacement $d\phi_n$ is $\Phi_n\,d\phi_n$. During a displacement $d\phi_n$, the corresponding displacement of the string is $a_n \sin\frac{n\pi x}{L}\,d\phi_n = du$, so that if the force acting on a length of string $dx$ at time $t$ is $f\,dx$, we find for $\Phi_n$,

$$\Phi_n = \sqrt{2/L}\int_0^L f\sin\frac{n\pi x}{L}\,dx. \tag{12}$$

In function space, this is evidently simply the component of $\Phi$ along the $n$th axis. An interesting case occurs when the force acts practically at a point $x = a$, such as when a violin string is plucked or bowed. We then write

$$\Phi_n = \sqrt{2/L}\,\sin\frac{n\pi a}{L}\int f\,dx = F\sqrt{2/L}\,\sin\frac{n\pi a}{L}.$$

This expression brings out the advantage of the concept of generalized force. For example, if a string is struck or bowed at its center, then $a = L/2$, and $\Phi_n = 0$ when $n$ is an even integer. This means that this force can have no effect on the even overtones and can only affect the odd overtones. If the string is originally at rest, no matter what kind of force is applied at the center, only odd overtones appear in the resultant vibration. No even overtones ever occur, as the normal coordinates are uncoupled and each normal coordinate behaves just as if all the others were absent. These conclusions, immediately obvious from the expression for $\Phi_n$, are not at all obvious if one considers only the usual force acting at a point of the string.

Another case of interest occurs when a periodic force acts on a point $x = a$ of the string. We then have $\Phi_n = F\sqrt{2/L}\,\sin\frac{n\pi a}{L}\cos\omega t$ and the equation of motion for $\phi_n$ is very much like the equation of forced motion of a one-dimensional oscillator. The solution of this equation is then

$$\phi_n = A_n\cos(\omega_n t - \epsilon_n) + \frac{F\sqrt{2/L}\,\sin(n\pi a/L)}{\mu(\omega_n^2 - \omega^2)}\cos\omega t.$$

The first term is the solution of the homogeneous equation and represents the free vibration of this mode, and the second represents the forced vibration indicating all the characteristics of resonance which we have previously studied.

87. The Vibrating String with Friction.
Thus far we have neglected friction forces which must act in real cases. Let us assume that the motion of our string is opposed by a frictional force such that the force on each element of the string is proportional to its velocity. The partial differential equation of the free motion of the string becomes

$$\frac{\partial^2 u}{\partial t^2} + k\frac{\partial u}{\partial t} = \frac{T}{\mu}\frac{\partial^2 u}{\partial x^2}. \tag{13}$$

We can treat this problem rather simply by noting that there is a function $G$, called the dissipation function, which is one-half the rate at which energy disappears from the system and which has just the same form as the kinetic energy $T_1$. In fact, we have

$$G = \frac{1}{2}\mu k\int_0^L \dot u^2\,dx.$$

One can easily show that the Lagrangian equations when there is a dissipation function become

$$\frac{d}{dt}\left(\frac{\partial L_1}{\partial\dot q_i}\right) - \frac{\partial L_1}{\partial q_i} + \frac{\partial G}{\partial\dot q_i} = Q_i. \tag{14}$$

According to the special law of friction we have assumed, the dissipation function has the same form as $T_1$, so that if we introduce the normal coordinates $\phi_1, \phi_2, \ldots$ etc., which we found reduced the expressions for $T_1$ and $V$ to sums of squares, they will also do the same for $G$, so that we can separate the equations of motion for each coordinate $\phi_n$. Proceeding as in the last paragraph, we find

$$G = \frac{\mu k}{2}\sum_n \dot\phi_n^2.$$

The equation for $\phi_n$ then becomes

$$\ddot\phi_n + k\dot\phi_n + \omega_n^2\phi_n = \frac{\Phi_n}{\mu}, \tag{15}$$

which is the same form as in the case of a one-dimensional damped oscillator. From this we see that each of the overtones has the same logarithmic decrement, so that in a free vibration the various overtones maintain their relative amplitudes.
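The statement about relative amplitudes can be illustrated by integrating Eq. (15), with $\Phi_n = 0$, for several overtones having the same $k$. The sketch below uses a classical Runge-Kutta step and compares the envelope of each overtone after a time $t$ with the factor $e^{-kt/2}$; it assumes the underdamped case $\omega_n > k/2$. The fundamental frequency, the value of $k$, the time interval, and the helper names are hypothetical choices made only for illustration.

```python
import math

def damped_mode_rk4(omega_n, k, phi0, t_end, steps=20000):
    """Integrate phi'' + k phi' + omega_n^2 phi = 0 (Eq. (15) with zero force)
    by the classical fourth-order Runge-Kutta method, released from rest at phi0."""
    def deriv(phi, vel):
        return vel, -k * vel - omega_n ** 2 * phi
    h = t_end / steps
    phi, vel = phi0, 0.0
    for _ in range(steps):
        k1p, k1v = deriv(phi, vel)
        k2p, k2v = deriv(phi + 0.5 * h * k1p, vel + 0.5 * h * k1v)
        k3p, k3v = deriv(phi + 0.5 * h * k2p, vel + 0.5 * h * k2v)
        k4p, k4v = deriv(phi + h * k3p, vel + h * k3v)
        phi += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
        vel += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return phi, vel

def damped_mode_exact(omega_n, k, phi0, t):
    """Closed form for the underdamped case omega_n > k/2, released from rest."""
    w = math.sqrt(omega_n ** 2 - 0.25 * k * k)
    return phi0 * math.exp(-0.5 * k * t) * (math.cos(w * t) + 0.5 * k / w * math.sin(w * t))

def envelope(phi, vel, omega_n, k):
    """Amplitude of the slowly decaying envelope, recovered from phi and its velocity."""
    w = math.sqrt(omega_n ** 2 - 0.25 * k * k)
    return math.hypot(phi, (vel + 0.5 * k * phi) / w)

omega_1 = 2.0 * math.pi   # hypothetical fundamental angular frequency
k, t_end = 0.4, 5.0       # hypothetical friction constant and elapsed time

ratios = {}
for n in (1, 2, 3):
    phi, vel = damped_mode_rk4(n * omega_1, k, 1.0, t_end)
    ratios[n] = envelope(phi, vel, n * omega_1, k) / envelope(1.0, 0.0, n * omega_1, k)

print("envelope ratios after t_end:", ratios)
print("exp(-k t_end / 2):          ", math.exp(-0.5 * k * t_end))
```

Each overtone's envelope is reduced by the same factor in the same time, which is the sense in which the relative amplitudes are maintained.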
In the case of a forced vibration caused by a periodic force $F\cos\omega t$ acting at the point $x = a$, we have $\Phi_n = F\sqrt{2/L}\,\sin\frac{n\pi a}{L}\cos\omega t$, and the steady-state vibrations are given by

$$\phi_n = \frac{F}{\mu}\sqrt{\frac{2}{L}}\,\sin\frac{n\pi a}{L}\;\frac{(\omega_n^2 - \omega^2)\cos\omega t + k\omega\sin\omega t}{(\omega_n^2 - \omega^2)^2 + (k\omega)^2}. \tag{16}$$

This is, of course, essentially the same solution we obtained in the discussion of a one-dimensional oscillator.

The particularly simple solutions just obtained depend entirely on the simple form of the law of friction we have assumed. In general, for vibrating systems, the presence of frictional forces does not prevent us from setting up the kinetic and potential energies as a sum of squares. But this transformation will in general not transform the dissipation function $G$ to a sum of squares. Only in very special cases, such as the law of friction assumed above, does the transformation also reduce $G$ to a sum of squares. The general equation of motion for the coordinate $\phi_1$, for example, will be of the form $a_1\ddot\phi_1 + c_1\phi_1 + \sum_i b_{1i}\dot\phi_i = \Phi_1$ instead of the simpler form obtained above in which we have only $\dot\phi_1$ appearing. Thus in the general case of frictional forces there is coupling between the various coordinates so that we have much more complicated types of motion. In Chap. XI, Prob. 3, we had such a case with two coupled circuits with resistance and found that we could get a simple solution only for very small frictional forces.

Problems

1. Write down the Hamiltonian function for a vibrating string, using normal coordinates. Set up Hamilton's equations, and show that they are satisfied for the solution we have found.

2. A sinusoidal force of constant amplitude but adjustable frequency acts on an arbitrary point of a string. The string is in addition damped by a frictional force proportional to the velocity.
Discuss the resonance of the string to the force, computing, for example, the total energy of the string as a function of the applied frequency, and showing that the resulting resonance curve goes through maxima corresponding to the various overtone frequencies. Find approximate heights and breadths of the maxima. Neglect the transient vibrations.

3. Prove the orthogonality relations for the normal functions for the weighted string; that is, prove

$$\sin\frac{n\pi x_1}{L}\sin\frac{m\pi x_1}{L} + \cdots + \sin\frac{n\pi x_N}{L}\sin\frac{m\pi x_N}{L} = 0.$$

4. Using the orthogonality relations of Prob. 3, and the analogy of the continuous string, set up a method for finding the amplitudes of the various overtones of the weighted string, in terms of the initial displacements and velocities of the particles.

5. Apply the method of Prob. 4 to the special case of two coupled particles, as taken up in Prob. 1, Chap. XI.

6. Apply Prob. 4 to the case of four particles, as in Prob. 1, Chap. XII.

7. Consider two coupled mechanical vibrating systems, with friction. In general, a dissipative function cannot be set up, and the problem of the motion cannot be solved exactly. Show what relations the frictional forces must satisfy in order to have a dissipative function. Write down the corresponding relations also for the electrical case.

8. What sort of force must be applied to a string in order that the forced motion should be a pure vibration of the $n$th harmonic?

9. Consider the case of two coupled particles as in Prob. 1, Chap. XI. Show that if equal external forces act on both, the overtone in which they vibrate in opposite directions can never be excited.

10. In the case of the two coupled particles of Prob. 1, Chap. XI, assume that at $t = 0$ both particles are at rest, but that one particle is displaced a distance $d$, the other not being displaced at all.
Find the amplitudes of the two overtones, writing down the formulas for the displacements of each particle as functions of time.

CHAPTER XIV
THE STRING WITH VARIABLE TENSION AND DENSITY

In the last two chapters, we have considered the problem of the vibration of a string of constant density and uniform tension. These results may now be extended for the more general case of variable tension and density. We shall not be able to carry through the results in complete detail; for, as we shall see, we are led to a more complicated differential equation, which we cannot solve in general. But we shall find that the theory of expansion in orthogonal functions, and all the general relations, go through just as with the uniform string, so that we can derive a good deal of information. We shall also develop perturbation methods, which can be used when the tension and density have only small deviations from constancy.

The importance of the problems considered in this chapter arises more from what they suggest than from the specific problems considered. Strings of variable density are of small practical importance. But the string is the simplest case of a vibrating continuum. Waves in three dimensions resemble waves on a string. A string of variable density resembles an optical medium of variable index of refraction, and we meet problems of reflection and refraction. Many three-dimensional problems can actually be reduced to one-dimensional cases, and these are all likely then to take on just the character of our string of variable density. It forms, so to speak, the type for much of our more complicated work. In wave mechanics, for instance, most of our problems reduce to a mathematical form which is identical with that of the present chapter. The perturbation theory we develop in this chapter is one set up originally for use with variable strings, yet it has had most important effects in the development of the quantum theory.

88. Differential Equation for the Variable String.
We set up the differential equation of motion exactly as we have done in Chap. XII. In calculating the resultant force on an element $dx$ of our string we found $\left(T\frac{\partial u}{\partial x}\right)_{x+dx} - \left(T\frac{\partial u}{\partial x}\right)_{x}$, and this is $\frac{\partial}{\partial x}\left(T\frac{\partial u}{\partial x}\right)dx$, which reduces as before to $T\frac{\partial^2 u}{\partial x^2}\,dx$ for constant tension. The remainder of the derivation proceeds as before, and the equation of motion becomes:

$$\mu\frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial x}\left(T\frac{\partial u}{\partial x}\right), \tag{1}$$

where both $T$ and $\mu$ are now functions of $x$. If we assume that $u$ is proportional to a function of $x$ times $e^{i\omega t}$, we find that we get an equation for the function of $x$ alone:

$$\frac{d}{dx}\left(T\frac{du}{dx}\right) + \omega^2\mu u = 0, \tag{2}$$

where this $u(x)$ is the part of $u$ depending on $x$.

89. Approximate Solution for Slowly Changing Density and Tension.
The above Eq. (2) is a linear second-order differential equation with variable coefficients, on account of the functions $T$ and $\mu$, which depend on $x$. We can give no general method of exact solution, except the power series method. To apply that, of course, $T$ and $\mu$ must be expressed as power series in $x$. But it turns out that the solutions of the equation are not very different from sines and cosines of $x$, and a very useful approximate method of solution is based on this fact, good when the density and tension do not change by a large fraction of themselves in one wave length. This approximate solution is simple, and forms a convenient method for discussing the equation qualitatively. The effect of the variable density and tension comes in two ways: first, the wave length depends on the position, and second the amplitude depends on $x$. Thus, instead of $A\sin\frac{2\pi x}{\lambda}$, as with the uniform string, the actual solution for the function of $x$ can be at least approximately written in the form $u = A(x)\sin B(x)$. We can see easily the form which $B$ must have for the nonuniform string. For plainly $\frac{B(x_2) - B(x_1)}{2\pi}$ must measure the number of wave lengths between $x_1$ and $x_2$, on account of the way in which $B$ appears in the sine function.
But now if $\lambda$ is the wave length, regarded as a function of $x$, $dx/\lambda$ is just the number of wave lengths in distance $dx$, so that

$$\frac{B(x_2) - B(x_1)}{2\pi} = \int_{x_1}^{x_2}\frac{dx}{\lambda},$$

from which evidently $B(x) = 2\pi\int dx/\lambda$. Since the wave length can also be written $2\pi/\lambda = \omega\sqrt{\mu/T}$, this is equivalent to $B(x) = \omega\int\sqrt{\mu/T}\,dx$. It is not hard to show that if we set $A = \text{constant}/\sqrt[4]{\mu T}$, the resulting expression

$$A\,e^{iB} = \frac{\text{constant}}{\sqrt[4]{\mu T}}\,e^{i\omega\int\sqrt{\mu/T}\,dx}, \tag{3}$$

or the corresponding real quantity

$$\frac{\text{constant}}{\sqrt[4]{\mu T}}\cos\left(\omega\int\sqrt{\mu/T}\,dx - \alpha\right),$$

forms an approximate solution of the differential equation.

To prove this, we may proceed as follows: we assume the solution $u = A\,e^{i\omega\int\sqrt{\mu/T}\,dx}$, where $A$ is an undetermined function of $x$, and substitute in the differential equation. When the necessary differentiations and substitutions are performed, the terms in $\omega^2\mu$ cancel, and we obtain a differential equation for $A$, which may be written, after a little manipulation,

$$\lambda^2\left(\frac{1}{A}\frac{d^2A}{dx^2} + \frac{1}{T}\frac{dT}{dx}\,\frac{1}{A}\frac{dA}{dx}\right) + 4\pi i\,\lambda\left[\frac{1}{A}\frac{dA}{dx} + \frac{1}{4}\left(\frac{1}{T}\frac{dT}{dx} + \frac{1}{\mu}\frac{d\mu}{dx}\right)\right] = 0, \tag{4}$$

where $\lambda = \dfrac{2\pi}{\omega}\sqrt{\dfrac{T}{\mu}}$ is the wave length of the disturbance. Now we are assuming that $\mu$, $T$, and consequently $A$, do not change by a large fraction of themselves in a wave length. Thus the quantities like $\lambda\,\dfrac{1}{A}\dfrac{dA}{dx}$, measuring the fractional change in $A$ in a wave length, are numbers small compared with 1. Their squares, then, and their rates of change in one wave length, can be neglected, and that means that the first set of terms above, in $\lambda^2$, can be neglected in comparison with the second set, in $\lambda$. Considering only the latter terms, we can rewrite Eq. (4) as

$$\frac{d\ln A}{dx} + \frac{1}{4}\left(\frac{d\ln T}{dx} + \frac{d\ln\mu}{dx}\right) = 0, \qquad \frac{d\ln\left[A(\mu T)^{1/4}\right]}{dx} = 0, \qquad A(\mu T)^{1/4} = \text{constant},$$

giving the solution we wished to prove.

90. Progressive Waves and Standing Waves.

In the problems of Chap. XII, we noted that there were two sorts of waves possible in a uniform string: progressive waves, and standing waves.
The progressive waves traveled along with a velocity $v$; an example was $\cos\omega(t - x/v)$, in which the displacement has the same value at all points for which $t - x/v = \text{constant}$, or $x = vt + \text{constant}$, points traveling along with velocity $v$. Similarly in our general case, we can set up a complex solution

$$\frac{\text{constant}}{\sqrt[4]{\mu T}}\,e^{i\omega\left(t - \int dx/v\right)},$$

where $v = \sqrt{T/\mu}$. The real part is

$$\frac{\text{constant}}{\sqrt[4]{\mu T}}\cos\omega\left(t - \int\frac{dx}{v}\right),$$

where the equation $t - \int dx/v = \text{constant}$ gives, by differentiation, $dx/dt = v$, verifying that the velocity of propagation of the progressive wave is $v = \sqrt{T/\mu}$, varying from point to point along the string. Thus in the general case we can have a progressive wave along the nonuniform string. We shall see later in the chapter, however, that this holds only approximately, and only for strings with slowly varying density and tension. Where the constants vary rapidly, a reflected wave is set up, traveling in the opposite direction, and the superposition of direct and reflected waves eventually produces something more like a standing wave.

An example of a standing wave with a uniform string is $\sin\omega t\,\sin(\omega x/v)$, or in the general case

$$\frac{\text{constant}}{\sqrt[4]{\mu T}}\,\sin\omega t\,\sin\left(\omega\int\frac{dx}{v}\right).$$

This is a product of a function of $t$ and a function of $x$, so that such a wave has nodes, values of $x$ for which the function of $x$ is always zero, so that the vibration always has zero amplitude. We have seen that by combination of two progressive waves we can build up a standing wave; similarly by adding two standing waves we can get a progressive wave, as we see from the relation

$$\cos\omega t\,\cos\left(\omega\int\frac{dx}{v}\right) + \sin\omega t\,\sin\left(\omega\int\frac{dx}{v}\right) = \cos\omega\left(t - \int\frac{dx}{v}\right).$$

Thus either sort of wave satisfies the differential equation, and we can add solutions as we always can with homogeneous linear differential equations.

Now suppose a string is held at one point.
That means that we must limit ourselves to a particular set of solutions of the differential equation: the standing waves which have a node at that point. Thus in our approximate solution, we must take the space function

$$\frac{\text{constant}}{\sqrt[4]{\mu T}}\,\sin\left(\omega\int_{x_0}^{x}\frac{dx}{v}\right),$$

where $x_0$ is the point where the string is held. Suppose we imagine a semi-infinite string, held at one point, with a wave train of finite length approaching the end. The wave is reflected from the end, travels back, and the superposition of the two trains, in opposite directions, forms the standing wave. This wave will have nodes at definite points on the string. It may have any arbitrary frequency, but the nodes will be differently spaced with different frequencies.

If now the string is held at two points, instead of one, we meet a difficulty: with an arbitrary frequency, the string will not have a node at the second point. We must limit our frequency to one of the discrete set for which there are nodes at both ends. Thus the fact of having the string held at both ends automatically sets up a discrete set of possible frequencies of vibration, the overtones, with a particular form of vibration for each. We let the $n$th overtone have a wave form represented by $u_n(x)$, and an angular frequency $\omega_n$. Thus the whole solution may be written

$$u = \sum_n\left(A_n\cos\omega_n t + B_n\sin\omega_n t\right)u_n(x), \tag{5}$$

where the constants $A_n$ and $B_n$ are chosen to satisfy the initial conditions at $t = 0$. If our analytic approximation to the function is good, we have

$$u_n(x) = \frac{\text{constant}}{\sqrt[4]{\mu T}}\,\sin\left(\omega_n\int_{x_0}^{x}\frac{dx}{v}\right), \tag{6}$$

with $v = \sqrt{T/\mu}$. Since the displacement is zero not only at $x_0$, but also at the other end $x_1$, we must have

$$\omega_n\int_{x_0}^{x_1}\frac{dx}{v} = n\pi, \tag{7}$$

where $n$ is an integer, which as we readily see equals 1 when there are no nodes between the ends, 2 when there is one node, etc.
This leads at once to the condition

$$\omega_n = \frac{n\pi}{\displaystyle\int_{x_0}^{x_1} dx/v} \tag{8}$$

for the angular velocities, which for the uniform string reduces to

$$\omega_n = \frac{n\pi}{L}\sqrt{\frac{T}{\mu}},$$

where $L = x_1 - x_0$ is the length of the string. If our analytic approximation to the functions $u_n$ is not good, we must simply choose those particular functions for our $u_n$'s which have nodes at $x_0$ and $x_1$, labeling them in order, the one with $n - 1$ nodes between the ends being called $u_n$, and then must find the angular frequencies connected with these particular functions. We meet such a case, for example, in some of the problems, where the functions $u_n$ are Bessel's functions, and where we simply must look up the nodes in tables of the roots of Bessel's functions. The particular functions $u_n$ satisfying both differential equation and boundary conditions are called normal functions, or characteristic functions, or wave functions, and the frequencies $\omega_n$ are sometimes called characteristic numbers.

91. Orthogonality of Normal Functions.

We can now prove easily, and quite generally, that the normal functions $u_n$ are orthogonal. For this purpose we consider two normal functions $u_n$ and $u_m$, which are solutions of the differential equation. We then have the identities

$$\frac{d}{dx}\left(T\frac{du_n}{dx}\right) + \omega_n^2\,\mu\,u_n = 0 \qquad\text{and}\qquad \frac{d}{dx}\left(T\frac{du_m}{dx}\right) + \omega_m^2\,\mu\,u_m = 0.$$

We multiply the first equation by $u_m$, the second by $u_n$, subtract one from the other, and then integrate over the string, which we assume to extend from $x = 0$ to $x = L$. We thus obtain

$$\int_0^L\left[u_m\frac{d}{dx}\left(T\frac{du_n}{dx}\right) - u_n\frac{d}{dx}\left(T\frac{du_m}{dx}\right)\right]dx = \left(\omega_m^2 - \omega_n^2\right)\int_0^L \mu(x)\,u_n u_m\,dx.$$

The left side integrated by parts yields immediately

$$\left[T\left(u_m\frac{du_n}{dx} - u_n\frac{du_m}{dx}\right)\right]_0^L - \int_0^L T\left(\frac{du_m}{dx}\frac{du_n}{dx} - \frac{du_n}{dx}\frac{du_m}{dx}\right)dx.$$

The integral obviously vanishes, and the integrated part vanishes since both $u_n$ and $u_m$ are zero for $x = 0$ and $x = L$.
In general the integrated part would vanish if either $u$ or $du/dx$ vanished at the boundaries, or if an expression of the form $u + a\,du/dx$ vanished at each boundary. Thus the right side of the equation above yields us as the analogue of our former orthogonality relation

$$\int_0^L \mu(x)\,u_n u_m\,dx = 0, \quad\text{if } n \neq m, \tag{9}$$

since, when $n = m$, the integral need not vanish to satisfy the original equation. We shall assume the functions to be normalized so that

$$\int_0^L \mu(x)\,u_n^2\,dx = 1. \tag{10}$$

In the previous chapter, where the density $\mu$ was independent of $x$, we could simply omit that factor in the integrals, changing the normalization condition to $\int u_n^2\,dx = 1$, without any error other than a change of a constant factor in the functions $u_n$. Here, however, the density factor must be kept in. We can see the analogy to the corresponding situation with the two coupled particles. There, if the masses of the particles were $m_1$, $m_2$, and their displacements were $y_1$, $y_2$, we had to set up new quantities $x_1$, $x_2$, equal to $\sqrt{m_1}\,y_1$ and $\sqrt{m_2}\,y_2$, respectively. We could give the normalization conditions by stating, for example, that the unit vector along $X$ has unit magnitude. The coordinates of the extremity of this vector, in the notation of Chap. XI, were $x_1 = \alpha'$, $x_2 = \beta'$. Squaring the magnitude of this vector, we had the normalization condition $x_1^2 + x_2^2 = \alpha'^2 + \beta'^2 = 1$. But this is equal to $m_1 y_1^2 + m_2 y_2^2$, where the $y$'s are the actual displacements. Thus in that case, just as here, we must weight the squares or products of displacements, where they appear in the orthogonality or normalization conditions, with the respective masses. Here the term $\mu(x)\,dx$ is just the mass of the element $dx$, so that the analogy is complete.

92. Expansion of an Arbitrary Function Using Normal Functions.

We have seen that we can write our solution

$$u = \sum_n\left(A_n\cos\omega_n t + B_n\sin\omega_n t\right)u_n(x).$$
If the initial conditions are $u(x, 0) = f(x)$ and $\dfrac{\partial u}{\partial t}(x, 0) = F(x)$, where $u(x, t)$ is the function of coordinate and time, we have, substituting in our general solution,

$$f(x) = \sum_n A_n u_n, \qquad F(x) = \sum_n B_n\,\omega_n u_n, \tag{11}$$

and we have the general problem of expanding an arbitrary function in a series of normal functions, very much like our previous problem of expanding an arbitrary function in a Fourier series. As before, we shall content ourselves with showing that we can find expressions for the coefficients $A_n$ and $B_n$ which formally satisfy this type of expansion. The remainder of the problem, showing that the series so built up really represents the function and that it converges, will not be taken up here. It is sufficient to say that such proofs can be given.

Let us multiply each of Eqs. (11) on both sides by $\mu(x)\,u_m$, and integrate from $x = 0$ to $x = L$. We thus have

$$\int_0^L \mu(x)\,u_m f(x)\,dx = \sum_n A_n\int_0^L \mu(x)\,u_m u_n\,dx$$

and

$$\int_0^L \mu(x)\,u_m F(x)\,dx = \sum_n \omega_n B_n\int_0^L \mu(x)\,u_m u_n\,dx.$$

On the right side of each of these equations each term for which $m \neq n$ vanishes because of our orthogonality relations. The remaining term contains an integral which has the value unity if the functions $u_n$ are normalized. Thus the whole sum reduces to $A_m$ (or in the second equation to $\omega_m B_m$), and we have found expressions for our coefficients:

$$A_m = \int_0^L \mu(x)\,f(x)\,u_m\,dx, \qquad B_m = \frac{1}{\omega_m}\int_0^L \mu(x)\,F(x)\,u_m\,dx. \tag{12}$$

It is clear that our discussion of the Fourier expansion is but a special case of the general one here discussed. The most convenient point of view to take is to define the scalar product of two functions $f(x)$ and $\phi(x)$ as

$$\int_0^L \mu(x)\,f(x)\,\phi(x)\,dx.$$

Then clearly our orthogonality and normalization conditions are just what we should expect from our discussion of orthogonal vectors in function space, in the last chapter.
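As a concrete check of the weighted orthogonality relations (9) and (10) and of the coefficient formula (12), the following sketch uses the string of density $a/x^2$ treated in Prob. 3 of this chapter, whose normal functions are known in closed form. The numerical values of $a$, $x_1$, $x_2$, the trial shape $f(x)$, and all helper names are assumptions chosen only for illustration; the integrals run from $x_1$ to $x_2$ instead of from 0 to $L$, because that string is held at $x_1$ and $x_2$.

```python
import numpy as np

# Assumed illustrative parameters: density mu(x) = a / x**2, string held at x1 and x2.
a, x1, x2 = 1.0, 1.0, 3.0
S = np.log(x2 / x1)
x = np.linspace(x1, x2, 4001)
mu = a / x**2


def integrate(y):
    """Trapezoidal rule on the fixed grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def u(n):
    """Normal function of Prob. 3, scaled so that the integral of mu * u_n**2 is 1."""
    return np.sqrt(2.0 / (a * S)) * np.sqrt(x) * np.sin(n * np.pi * np.log(x / x1) / S)


def scalar_product(f, g):
    """Density-weighted scalar product of two functions sampled on the grid."""
    return integrate(mu * f * g)


# Eqs. (9) and (10): the weighted Gram matrix should be close to the unit matrix.
gram = np.array([[scalar_product(u(n), u(m)) for m in range(1, 6)]
                 for n in range(1, 6)])

# Eq. (12) with zero initial velocity: coefficients of an assumed initial shape f(x).
f = (x - x1) * (x2 - x)
N = 30
A = np.array([scalar_product(f, u(m)) for m in range(1, N + 1)])
f_series = sum(A[m - 1] * u(m) for m in range(1, N + 1))
max_error = float(np.max(np.abs(f_series - f)))

print(np.round(gram, 4))
print("largest deviation of the 30-term series from f:", max_error)
```

If the density factor is dropped from `scalar_product`, the Gram matrix is no longer diagonal, which is the practical meaning of the remark that here the density factor must be kept in.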
The rotation of coordinates in function space again separates variables, as it did in the case of the uniform string; but now the separate normal or characteristic functions are more complicated in form, as we see from the more complicated differential equations they satisfy, though they still vibrate sinusoidally with time. When we carry out an expansion of a function $f(x)$ in terms of the characteristic functions, the coefficients, as with the Fourier expansion, are just the scalar products of the corresponding characteristic functions with the given function, or $\int_0^L \mu(x)\,f(x)\,u_n\,dx$, as we wrote above.

93. Perturbation Theory.

One approximate method of integrating the differential equation of the nonuniform vibrating string has already been indicated, making use of the resemblance of the actual functions to sines and cosines. An entirely different approximate method, the method of perturbations, is also frequently useful. This is a method which applies if the problem is very nearly a soluble one, the density and tension varying only slightly from their values in the soluble case. The usual application is to an almost uniform string. For simplicity we consider only the case where the tension $T$ is a constant, while the density is a function $\mu(x)$, almost equal to $\mu_0(x)$, for which the problem can be solved. We assume that we know the characteristic functions $u_n^0$ and frequencies $\omega_n^0$ for the soluble case, satisfying, therefore, the differential equations

$$T\frac{d^2u_n^0}{dx^2} + (\omega_n^0)^2\,\mu_0(x)\,u_n^0 = 0. \tag{13}$$

We now remember that the functions $u_n^0$ form an orthogonal set, and that any arbitrary function can be expanded in series of such functions. Thus in particular the $n$th characteristic function $u_n$ of the real problem can be so expanded:

$$u_n = \sum_k A_{nk}\,u_k^0. \tag{14}$$

We may regard our problem as that of determining the constants $A_{nk}$. Considered in function space, this problem is very simple.
The functions $u_k^0$ form one set of orthogonal unit vectors, the $u_n$'s another, and these equations merely express one set in terms of the other; they are the equations for a rotation of coordinates in function space, from the axes characteristic of the "unperturbed" problem with density $\mu_0$ to the "perturbed" problem with density $\mu$. The easiest way of getting at the conditions for rotation is simply to substitute $u_n$ in the differential equation which we wish it to satisfy,

$$T\frac{d^2u_n}{dx^2} + \omega_n^2\,\mu\,u_n = 0.$$

If we do so, and use the differential equations which the $u_n^0$'s satisfy, we have easily

$$\sum_k A_{nk}\left[(\omega_k^0)^2\mu_0 - \omega_n^2\mu\right]u_k^0 = 0.$$

Now we may multiply by an arbitrary $u_m^0$, and integrate from 0 to $L$. Remembering that the $u^0$'s are orthogonal, the result is

$$\sum_k A_{nk}\left[(\omega_k^0)^2\mu^0_{mk} - \omega_n^2\mu_{mk}\right] = 0, \tag{15}$$

where

$$\mu^0_{mk} = \int_0^L \mu_0(x)\,u_m^0 u_k^0\,dx = \begin{cases} 1 & \text{if } m = k, \\ 0 & \text{if } m \neq k, \end{cases}$$

and

$$\mu_{mk} = \int_0^L \mu(x)\,u_m^0 u_k^0\,dx,$$

a quantity differing from $\mu^0_{mk}$ only by small quantities of the order of the deviation between $\mu$ and $\mu_0$. We have here an infinite set of simultaneous homogeneous linear equations ($m$ can take on any value) for the unknown constants $A_{nk}$. These can be written, for a given $n$,

$$\begin{aligned}
A_{n1}\left[(\omega_1^0)^2 - \omega_n^2\mu_{11}\right] + A_{n2}\left(-\omega_n^2\mu_{12}\right) + A_{n3}\left(-\omega_n^2\mu_{13}\right) + \cdots &= 0 \\
A_{n1}\left(-\omega_n^2\mu_{21}\right) + A_{n2}\left[(\omega_2^0)^2 - \omega_n^2\mu_{22}\right] + \cdots &= 0 \\
A_{n1}\left(-\omega_n^2\mu_{31}\right) + \cdots &= 0 \\
\cdots &= 0.
\end{aligned} \tag{16}$$

In general these will have no solutions other than the trivial one in which all the $A$'s vanish; the condition for existence of a solution is that the determinant of coefficients vanish. This forms an equation for $\omega_n^2$, called a secular or determinantal equation, and just analogous to that which we found with the problem of two coupled vibrations, when we made a rotation of coordinates, and we recognize it as the general type met in such problems. In this case, the equation has an infinite number of roots, one near each unperturbed frequency.
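For a numerical illustration, the infinite determinant can be cut off after the first few unperturbed functions, which turns Eqs. (16) into an ordinary matrix eigenvalue problem. The sketch below does this for an assumed density $\mu = \mu_0(1 + \epsilon x/L)$ on a string of constant tension, with the sine functions of the uniform string as the $u_k^0$. The parameter values, the cutoff `K = 8`, and the variable names are illustrative assumptions and are not part of the text.

```python
import numpy as np

# Assumed illustrative problem: constant tension T, uniform unperturbed density mu0,
# perturbed density mu(x) = mu0 * (1 + eps * x / L).
T, L, mu0, eps = 1.0, 1.0, 1.0, 0.05
K = 8                                   # number of unperturbed functions kept
N = 4000
x = (np.arange(N) + 0.5) * L / N        # midpoint-rule grid
dx = L / N
mu = mu0 * (1.0 + eps * x / L)

k = np.arange(1, K + 1)
omega0 = k * np.pi / L * np.sqrt(T / mu0)                            # unperturbed frequencies
u0 = np.sqrt(2.0 / (mu0 * L)) * np.sin(np.outer(k, x) * np.pi / L)   # rows are u_k^0(x)

# mu_mk = integral of mu(x) u_m^0 u_k^0 dx, the matrix appearing in Eq. (15).
M = (u0 * mu) @ u0.T * dx
D = np.diag(omega0**2)

# Eqs. (16) in matrix form read (D - omega^2 M) A = 0,
# so each root omega^2 is an eigenvalue of M^{-1} D.
omega_sq = np.sort(np.linalg.eigvals(np.linalg.solve(M, D)).real)
omega = np.sqrt(omega_sq)

for n in range(4):
    print(n + 1, omega0[n], omega[n], omega[n] / omega0[n])
```

With a perturbation of a few per cent, each of the lowest roots comes out within about one per cent of the corresponding unperturbed frequency, and slightly below it, as one expects when mass is added to the string.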
It is hardly feasible to solve the determinantal equation directly, though it is not hard to make an approximation to it. It is easiest, however, to proceed directly from the linear equations. If the $u^0$'s are nearly the same as the $u$'s, it is plain that we shall have $A_{nk} = 1$ almost, if $n = k$, or $= 0$ almost, if $n \neq k$. The only term in the equations which is large and need be considered is then that for which $n = k$ (so that $A_{nk}$ will be large) and simultaneously $m = k$ (so that $\mu^0_{mk}$ and $\mu_{mk}$ will be large). This term gives

$$A_{nn}\left[(\omega_n^0)^2 - \omega_n^2\mu_{nn}\right] = 0, \qquad\text{or}\qquad \omega_n^2 = \frac{(\omega_n^0)^2}{\mu_{nn}}.$$

If now $\mu = \mu_0 + \mu_1$, where $\mu_1$ is small compared with $\mu_0$, we have $\mu_{nn} = 1 + \int_0^L \mu_1\,(u_n^0)^2\,dx$, so that, using the first term of a binomial expansion,

$$\omega_n^2 = (\omega_n^0)^2\left(1 - \int_0^L \mu_1\,(u_n^0)^2\,dx\right), \tag{17}$$

correct to the first order of small quantities, but neglecting terms of the order of the square of the integral of $\mu_1$. It is not hard to get expressions of the same order of accuracy for the $A$'s.

94. Reflection of Waves from a Discontinuity.

We mentioned earlier that a progressive wave striking a discontinuity of density would be partly reflected, and only partly transmitted. It is easy to solve exactly the problem of propagation of the wave over the discontinuity, and as this is one of the exactly soluble cases of the vibration of the nonuniform string, and is the simplest problem of reflection, it is worth carrying its discussion through. Let us assume two uniform strings of different densities attached to each other and subject to the same tension $T$. Let the first string have a linear density $\mu_1$ and the second a density $\mu_2$. We shall take the point of junction as $x = 0$. We thus have different velocities of propagation $v_1 = \sqrt{T/\mu_1}$ and $v_2 = \sqrt{T/\mu_2}$ in the two strings. We may also define an "index of refraction" of one medium with respect to the other as $n = v_1/v_2 = \sqrt{\mu_2/\mu_1}$. At $x = 0$ we must satisfy certain conditions at every instant of time.
First, the displacement $u$ must be continuous across the boundary if the strings remain joined together, and secondly, the slope $\partial u/\partial x$ must also vary continuously across the boundary. Were the latter condition not fulfilled, we would have the impossible situation of a finite force acting on an infinitesimal piece of the strings at the junction. Let us consider a harmonic progressive wave in the first string ($\mu_1$) impinging on the junction. In the second string we shall have a wave traveling in the same direction as the impinging wave, but in order to satisfy the boundary conditions, we must assume a reflected wave in the first string. Thus

$$u_1 = A\,e^{2\pi i\left(\nu t - x/\lambda_1\right)} + B\,e^{2\pi i\left(\nu t + x/\lambda_1\right)} \qquad\text{and}\qquad u_2 = C\,e^{2\pi i\left(\nu t - x/\lambda_2\right)}.$$

The frequency is a fixed characteristic of the wave, independent of the medium in which the wave is propagated. The wave lengths $\lambda_1$ and $\lambda_2$ are related by the condition

$$\nu = \frac{v_1}{\lambda_1} = \frac{v_2}{\lambda_2}, \qquad\text{or}\qquad \frac{\lambda_1}{\lambda_2} = n.$$

At the junction, where $x = 0$, we have

$$(u_1)_0 = A\,e^{2\pi i\nu t} + B\,e^{2\pi i\nu t}, \qquad (u_2)_0 = C\,e^{2\pi i\nu t},$$

and

$$\left(\frac{\partial u_1}{\partial x}\right)_0 = -\frac{2\pi i}{\lambda_1}A\,e^{2\pi i\nu t} + \frac{2\pi i}{\lambda_1}B\,e^{2\pi i\nu t}, \qquad \left(\frac{\partial u_2}{\partial x}\right)_0 = -\frac{2\pi i}{\lambda_2}C\,e^{2\pi i\nu t}.$$

Thus the conditions of continuity give

$$A + B = C \qquad\text{and}\qquad \frac{A}{\lambda_1} - \frac{B}{\lambda_1} = \frac{C}{\lambda_2},$$

whence

$$\frac{B}{A} = \frac{\lambda_2 - \lambda_1}{\lambda_1 + \lambda_2} = -\frac{n - 1}{n + 1},$$

giving the ratio of the amplitude of the reflected to the incident wave. Two limiting cases are interesting: if $\mu_2 = \infty$, so that the junction is held fast, we have $n = \infty$, $B = -A$, or the wave is entirely reflected, with a change of phase. The other case is $\mu_2 = 0$, the junction is free, and we have $n = 0$, $B = A$, reflection again being complete, but with no change of phase. In both these cases the incident and reflected waves combine to give standing waves.

Problems

1. A heavy uniform flexible chain hangs freely from one end. The chain performs small lateral vibrations.
Show that the normal functions are $u_n = J_0\!\left(2\omega_n\sqrt{x/g}\right)$, where $J_0$ represents the Bessel function of order zero; $x$ is the distance from the bottom of the chain to any point, $g$ the acceleration of gravity, and $\omega_n$ is the angular frequency of the $n$th mode of vibration. For a chain 8 feet long, find the periods of the first few modes of vibration (use Jahnke-Emde's tables to get the roots of the Bessel functions).

2. One end of a uniform flexible chain of length $l$ is attached to a vertical rod which rotates at a constant angular velocity $\Omega_0$. Neglect the effect of gravity, so that the chain stands out horizontally under the tension of centrifugal force. Show that the differential equation for small vibrations transverse to the length of the chain is

$$\frac{d}{dx}\left[\left(l^2 - x^2\right)\frac{du}{dx}\right] + \frac{2\omega^2}{\Omega_0^2}\,u = 0.$$

Introduce the variable $y = x/l$, and solve the resulting equation by the power series method. The boundary conditions are $u(0) = 0$ and $u$ for $y = 1$ must remain finite. Note that the latter condition can only be fulfilled if the series breaks off to form a polynomial. Calculate the first three polynomials and derive a relation for the frequency of the $n$th mode of vibration. The polynomials so found are the Legendre polynomials of odd order.

3. A string stretched with a uniform tension $T$, and with a density $a/x^2$, is held at the points $x = x_1$ and $x = x_2$. Solve the equation, using the form $u = \sqrt{x}\,z$, and show that the general solution is $u = A\,x^{\frac{1}{2} + ik} + B\,x^{\frac{1}{2} - ik}$, where $k$ is defined by $k^2 + \frac{1}{4} = \omega^2 a/T$, and $\omega$ is the angular velocity. Show from this that the general form of the normal function is

$$\sqrt{x}\,\sin\left(n\pi\,\frac{\ln(x/x_1)}{\ln(x_2/x_1)}\right), \qquad n = 1, 2, 3, \ldots,$$

and that

$$\omega_n^2 = \frac{T}{a}\left[\frac{1}{4} + \frac{n^2\pi^2}{\left(\ln x_2/x_1\right)^2}\right].$$

4. Solve the differential equation of Prob. 3 by the approximate method described in this chapter, and show that the solution has the same form as the exact solution. Show that the two solutions coincide in the limit of large $a$.

5.
A progressive wave travels on a uniform string which at $x = 0$ is connected to a string whose density is $\mu = \mu_0 + \alpha x$. This second string is connected to a third at $x = l$ which has the constant density $\mu = \mu_0 + \alpha l$, and the whole is stretched with a uniform tension $T$. Using the approximate method, find the ratio of the amplitude of the wave transmitted in the third string to the original amplitude of the incident wave in the first string.

6. Consider a string of uniform density $\mu$, length $L$, but with a tension $T$ which varies slightly from its average tension $T_0$. Show with the help of a perturbation calculation that the angular frequency of the $n$th mode is given approximately by

$$\omega_n^2 = \frac{n^2\pi^2 T_0}{L^2\mu}\left(1 - \frac{1}{n\pi T_0}\int_0^L \frac{dT}{dx}\,\sin\frac{n\pi x}{L}\,\cos\frac{n\pi x}{L}\,dx\right).$$

7. A uniform string of density $\mu_0$, tension $T$, has a small load $m$ placed at $x = a$. Show that the frequency of the $n$th mode of vibration is approximately given by

$$\omega_n^2 = \frac{n^2\pi^2 T}{L^2\mu_0}\left(1 - \frac{2m}{\mu_0 L}\,\sin^2\frac{n\pi a}{L}\right).$$

Show that the effect of the additional load vanishes if it is placed at a node, and is biggest when at an antinode.

8. Show that the differential equation of Bessel's function $J_m$ is the same as that for a string of tension $T = x$, $\mu\omega^2 = x - m^2/x$. Using the approximate method developed for the vibration problem, show that approximately

$$J_m(x) = \frac{\text{constant}}{\sqrt[4]{x^2 - m^2}}\,\cos\left(\int\sqrt{1 - m^2/x^2}\,dx - \alpha\right),$$

where $x > m$.

9. Using the approximation of Prob. 8 for $J_0$ and $J_1$, compute the approximation functions for a number of values of $x$, and show by a table of values how well these agree with the correct functions. Choose the arbitrary amplitude and phase factors to make the functions agree with the values of $J_0$ and $J_1$ in the tables, for example making the zeros agree by adjusting $\alpha$, and the maxima by adjusting the amplitude, taking such values as to get the best agreement possible for large $x$'s.

10. Derive the differential Eq. (4) for $A$, in the approximate solution
$u = A\,e^{i\omega\int\sqrt{\mu/T}\,dx}$.

CHAPTER XV

THE VIBRATING MEMBRANE

The problem of a vibrating membrane is very little more difficult in principle than the string. Let us take two coordinates, $x$ and $y$, in the plane of the membrane, writing $u$ for the displacement at right angles to the plane, so that we wish a relation $u = u(x, y, t)$. Consider a small element of the membrane, bounded by $dx$ and $dy$. Let the mass per unit area be $\mu$, so that the mass of the element is $\mu\,dx\,dy$. Then its mass, times acceleration normal to the membrane, is $\mu\,dx\,dy\,\partial^2u/\partial t^2$. This is equal to the force arising from the tension. Let the tension be $T$. That is, if we cut the membrane along any line, the material on one side of the cut exerts a force on the material on the other, normal to the cut and equal to $T$ for each unit of length of the cut. We assume that $T$ is constant over the membrane. If the membrane were plane, the tension on its opposite edges would cancel, and we should have no resultant force. If it is curved, however, we may proceed as follows. Along the edge at $x + dx$, the tension is at right angles to the $y$ axis, almost along the $x$ axis, but with a small component along the $u$ direction, equal approximately to $T\left(\dfrac{\partial u}{\partial x}\right)_{x+dx}$ per unit of length, or this times $dy$ for the actual length $dy$. Similarly along the edge at $x$ the component is $-T\left(\dfrac{\partial u}{\partial x}\right)_{x}dy$, so that the sum is approximately $T\,\dfrac{\partial^2u}{\partial x^2}\,dx\,dy$. The forces acting along the edges at $y$ and $y + dy$ similarly add to $T\,\dfrac{\partial^2u}{\partial y^2}\,dx\,dy$, and the total force, the sum of these, is $T\left(\dfrac{\partial^2u}{\partial x^2} + \dfrac{\partial^2u}{\partial y^2}\right)dx\,dy$. Thus the differential equation, dividing by $dx\,dy$, is

$$\mu\,\frac{\partial^2u}{\partial t^2} = T\left(\frac{\partial^2u}{\partial x^2} + \frac{\partial^2u}{\partial y^2}\right). \tag{1}$$

95. Boundary Conditions on the Rectangular Membrane.

A membrane is ordinarily held fast around a certain curve. In this way one can get a great variety of problems, by taking different curves.
The two simplest are the rectangular membrane, and the circular membrane, or ordinary drum, and in the present section we consider the rectangular case, assuming the membrane to be held at $x = 0$, $x = X$, $y = 0$, $y = Y$. We solve first by the exponential method, assuming $u = e^{i(\omega t + kx + ly)}$. Then the differential equation becomes

$$-\mu\omega^2 = -T\left(k^2 + l^2\right), \qquad \omega = \sqrt{T\left(k^2 + l^2\right)/\mu},$$

giving the angular velocity of the vibration in terms of the quantities $k$ and $l$. Instead of the exponential solution we can equally well use sines or cosines. For example, with a given $\omega$, $k$, and $l$, we can take

$$u = e^{i\omega t}\left(e^{ikx + ily} - e^{-ikx + ily} - e^{ikx - ily} + e^{-ikx - ily}\right) = e^{i\omega t}\left(2i\sin kx\right)\left(e^{ily} - e^{-ily}\right) = -4\,e^{i\omega t}\sin kx\,\sin ly.$$

As a matter of fact, this solution with sines is the one we want, since it reduces to zero when $x = 0$ and $y = 0$. To apply the condition when $x = X$ and $y = Y$, we must make the sines zero at these points, or must have $\sin kX = 0$, $\sin lY = 0$, or $k = n\pi/X$, $l = m\pi/Y$, where $n$, $m$ are integers. In terms of these constants, we can then write

$$\omega_{nm} = \pi\sqrt{\frac{T}{\mu}}\,\sqrt{\frac{n^2}{X^2} + \frac{m^2}{Y^2}}, \tag{2}$$

so that instead of having overtones whose frequencies are integral multiples of a fundamental, the frequencies are given by a much more complicated relation.

There is one interesting result of this. Pleasing musical notes depend on having the frequencies of the overtones related in simple ways to the fundamental, so that they sound well together, as with a vibrating string. In a membrane or drum, in which these relations do not hold, the sound is far less musical than with a string. This suggests other cases, which do not exactly fall within the category of the present chapter. For example, a vibrating bell acts as a two-dimensional vibrating system, a little like a membrane, and has complicated overtones which in general are not harmonics.
But it has been found by trial that if bells are made in their conventional shape, overtones are so adjusted that the loud ones are actually in tune with each other, though a slight change of shape would destroy the quality.

96. The Nodes in a Vibrating Membrane.

If the membrane is vibrating with one overtone, the amplitude will be zero along certain lines, which will stay at rest. These nodal lines form a rectangular network, coming when $nx/X = 1, 2, \ldots, n - 1$, and for $my/Y = 1, 2, \ldots, m - 1$. At any instant, if the membrane is displaced upward in one rectangle, it will be displaced downward in all adjacent rectangles. Such a nodal arrangement is characteristic of all sorts of standing wave problems.

97. Initial Conditions.

At $t = 0$, we may wish to fix the shape and velocity of our membrane, obtaining initial conditions of the sort found with the string, and leading as before to Fourier series. For example, suppose the initial velocity is zero, the initial displacement a function $f(x, y)$. Then we must have

$$u = \sum_{n,m} A_{nm}\cos\omega_{nm}t\,\sin\frac{n\pi x}{X}\,\sin\frac{m\pi y}{Y}, \tag{3}$$

where $\omega_{nm}$ is given in Eq. (2), and where we have

$$f(x, y) = \sum_{n,m} A_{nm}\sin\frac{n\pi x}{X}\,\sin\frac{m\pi y}{Y}. \tag{4}$$

To find the coefficients $A$, the amplitudes of the various overtones necessary to satisfy the conditions, we must expand the function $f(x, y)$ in a series of products of sines, a double Fourier series, as it is called. As in the last chapter, we assume that the expansion can be carried through, and ask only for the values of the coefficients. Multiplying both sides of the equation by $\sin\dfrac{n'\pi x}{X}\sin\dfrac{m'\pi y}{Y}$, where $n'$, $m'$ are definite integers, we integrate with respect to $x$ from 0 to $X$, and with respect to $y$ from 0 to $Y$. We find as before that $\int_0^X \sin\dfrac{n\pi x}{X}\sin\dfrac{n'\pi x}{X}\,dx$ is zero unless $n = n'$, and is $X/2$ if $n = n'$. Thus the final result is the coefficient formula, Eq. (5). It is worth noting that this is the first time we have had to use a double integral.
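A minimal numerical sketch of such a double integral may help. Since each sine integral contributes a factor $X/2$ or $Y/2$, the coefficient is $4/XY$ times the double integral of $f$ multiplied by the two sines. The membrane dimensions, the trial shape $f(x, y) = x(X - x)\,y(Y - y)$, the grid size, and the function name are assumptions chosen only so that the result can be compared with a closed form.

```python
import numpy as np

# Assumed membrane size and initial shape; f vanishes on the whole boundary.
X, Y = 2.0, 1.0
N = 400
x = (np.arange(N) + 0.5) * X / N        # midpoint-rule grids in x and y
y = (np.arange(N) + 0.5) * Y / N
f = np.outer(x * (X - x), y * (Y - y))  # f[i, j] = f(x_i, y_j)


def coefficient(n, m):
    """A_nm = (4 / XY) * double integral of f sin(n pi x / X) sin(m pi y / Y) dx dy."""
    sx = np.sin(n * np.pi * x / X)
    sy = np.sin(m * np.pi * y / Y)
    return 4.0 / (X * Y) * (sx @ f @ sy) * (X / N) * (Y / N)


A11 = coefficient(1, 1)
A21 = coefficient(2, 1)
exact_A11 = 64.0 * X**2 * Y**2 / np.pi**6   # closed form for this particular f

print(A11, exact_A11, A21)
```

For this symmetric shape the coefficients with even $n$ or even $m$ vanish, so only the overtones that are themselves symmetric about the center of the membrane are excited.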
If $f(x, y)$ is a complicated function of the coordinates, it can, of course, be a very difficult problem actually to evaluate the integral in the coefficient formula

$$A_{n'm'} = \frac{4}{XY}\int_0^X\!\!\int_0^Y f(x, y)\,\sin\frac{n'\pi x}{X}\,\sin\frac{m'\pi y}{Y}\,dx\,dy. \tag{5}$$

98. The Method of Separation of Variables.

To solve our differential equation, we may adopt a slightly different method, called the method of separation of variables, which does not directly depend on the use of exponentials. It is a method for reducing the partial differential equation to a set of ordinary equations, and we shall find it very useful. In fact, it is so valuable that practically the only partial differential equations which can be solved at all are those for which this method can be used. We wish to solve

$$\frac{\partial^2u}{\partial t^2} = \frac{T}{\mu}\left(\frac{\partial^2u}{\partial x^2} + \frac{\partial^2u}{\partial y^2}\right).$$

Suppose we try to find a solution $u$ which is the product of a function of $x$, a function of $y$, and a function of $t$; say $u = P(x)\,Q(y)\,R(t)$, where $P$ is a function of $x$ to be determined, and so on. Of course, it is not obvious that one can find such a solution, but our experience would lead us to try it. If we substitute, we have, for example, $\partial u/\partial t = PQ\,dR/dt$, and so on. If we denote $dR/dt$ by $R'$, with corresponding notation, we then have $PQR'' = (T/\mu)\left(P''QR + PQ''R\right)$. Next we divide by $PQR$, obtaining

$$\frac{R''}{R} = \frac{T}{\mu}\left(\frac{P''}{P} + \frac{Q''}{Q}\right). \tag{6}$$

We now make the step characteristic of the method of separation of variables: we observe that the function $R''/R$ on the left of Eq. (6) is a function of $t$ alone, the quantity on the right a function of $x$ and $y$ alone. The equation then states that a certain function of $t$ equals a function of $x$ and $y$, whatever $x$, $y$, and $t$ may be. But this is clearly impossible in general. If, for example, we keep $x$ and $y$ constant, and vary $t$, the left side would change, the right remaining constant, and the equation would not be satisfied. The only exception, as this example shows, is if the left side is a constant, independent of $t$, and similarly if the right side is a constant, independent of $x$ and $y$.
Let us then impose these conditions, letting the constant be $-\omega^2$ (an arbitrary constant so far, but later to be identified with our other $\omega$). We have then two equations,

$$\frac{R''}{R} = -\omega^2, \quad\text{or}\quad R'' + \omega^2R = 0, \qquad\text{and}\qquad \frac{T}{\mu}\left(\frac{P''}{P} + \frac{Q''}{Q}\right) = -\omega^2. \tag{7}$$

Taking the latter equation, we may again separate. We write it

$$\frac{P''}{P} = -\frac{Q''}{Q} - \omega^2\frac{\mu}{T}. \tag{8}$$

The left side is a function of $x$, the right side of $y$, and by the same argument each is a constant, say $-k^2$. Then we have $P'' + k^2P = 0$, and $-k^2 = -(Q''/Q) - \omega^2(\mu/T)$. We can rewrite this last $Q''/Q = -l^2$, where

$$-l^2 = k^2 - \omega^2\frac{\mu}{T}, \qquad\text{or}\qquad \omega^2 = \frac{T}{\mu}\left(k^2 + l^2\right), \tag{9}$$

and it becomes $Q'' + l^2Q = 0$. We now have three ordinary differential equations for $P$, $Q$, and $R$, whose solutions are evidently $P = e^{ikx}$ (or $e^{-ikx}$, or $\sin kx$, or $\cos kx$), $Q = e^{ily}$, $R = e^{i\omega t}$, so that the final solution is as we found before, with the same relation between $\omega$, $k$, and $l$.

99. The Circular Membrane.

The differential equation for the circular membrane is the same as for the rectangular one, but the boundary condition is different: the displacement $u$ is always zero on a circle of radius $\rho$ about the origin. To solve the problem, the simplest method is to introduce polar coordinates, $r$, $\theta$; for then the boundary condition is that $u = 0$ when $r = \rho$, which is a condition easy to apply. Let us then write our equation in polar coordinates. Before doing it, we shall give the conventional names of the equations and symbols we meet. Our equation, which is often written

$$\frac{\partial^2u}{\partial x^2} + \frac{\partial^2u}{\partial y^2} - \frac{1}{v^2}\frac{\partial^2u}{\partial t^2} = 0, \tag{10}$$

where $v = \sqrt{T/\mu}$ is the velocity of the wave, is called the wave equation, for $u$ represents waves, either progressive or standing. The special case $\partial^2u/\partial x^2 + \partial^2u/\partial y^2 = 0$, where $u$ is independent of $t$, is called Laplace's equation. And the expression $\partial^2u/\partial x^2 + \partial^2u/\partial y^2$, which we have already seen can be written in vector notation $\nabla^2u$, is called the Laplacian of $u$.
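The relation (9) among $\omega$, $k$, and $l$ can be checked directly against the wave equation (10) by a rough numerical sketch: we take the separated solution with sines in $x$ and $y$ and a cosine in $t$, estimate its second derivatives by central differences at one point, and confirm that the two sides of the equation agree. The values of $T$, $\mu$, $k$, $l$, the test point, and the step `h` are arbitrary assumptions, and the helper names are not from the text.

```python
import numpy as np

# Assumed illustrative values of tension, mass per unit area, and wave numbers.
T, mu = 2.0, 0.5
k, l = 3.0, 4.0
omega = np.sqrt(T / mu * (k**2 + l**2))   # Eq. (9)


def u(x, y, t):
    """Separated solution P(x) Q(y) R(t) with sines in space and a cosine in time."""
    return np.sin(k * x) * np.sin(l * y) * np.cos(omega * t)


def second_difference(g, h):
    """Central-difference estimate of the second derivative of g at zero offset."""
    return (g(h) - 2.0 * g(0.0) + g(-h)) / h**2


x0, y0, t0, h = 0.3, 0.2, 0.1, 1e-3
u_tt = second_difference(lambda s: u(x0, y0, t0 + s), h)
u_xx = second_difference(lambda s: u(x0 + s, y0, t0), h)
u_yy = second_difference(lambda s: u(x0, y0 + s, t0), h)

# Residual of the wave equation; it should be small compared with u_tt itself.
residual = u_tt - (T / mu) * (u_xx + u_yy)
print(residual, u_tt)
```

Changing `omega` to any value that does not satisfy Eq. (9) leaves a residual comparable with `u_tt`, which shows that the three separation constants are not independent.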
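Our present problem is to find the Laplacian in polar coordinates.

100. The Laplacian in Polar Coordinates. Let us introduce r and $\theta$ by the equations $x = r\cos\theta$, $y = r\sin\theta$, $r = \sqrt{x^2 + y^2}$, $\theta = \tan^{-1}(y/x)$, so that

$$\frac{\partial r}{\partial x} = \frac{x}{r}, \quad \frac{\partial r}{\partial y} = \frac{y}{r}, \quad \frac{\partial\theta}{\partial x} = -\frac{y}{r^2}, \quad \frac{\partial\theta}{\partial y} = \frac{x}{r^2},$$

and

$$\frac{\partial^2 r}{\partial x^2} = \frac{1}{r} - \frac{x^2}{r^3}, \quad \frac{\partial^2 r}{\partial y^2} = \frac{1}{r} - \frac{y^2}{r^3}, \quad \frac{\partial^2\theta}{\partial x^2} = \frac{2xy}{r^4}, \quad \frac{\partial^2\theta}{\partial y^2} = -\frac{2xy}{r^4}.$$

Then we have

$$\frac{\partial u}{\partial x} = \frac{\partial u}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial u}{\partial\theta}\frac{\partial\theta}{\partial x}.$$

If we apply this process again, we find without difficulty

$$\frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 u}{\partial r^2}\left(\frac{\partial r}{\partial x}\right)^2 + 2\frac{\partial^2 u}{\partial r\,\partial\theta}\left(\frac{\partial r}{\partial x}\frac{\partial\theta}{\partial x}\right) + \frac{\partial^2 u}{\partial\theta^2}\left(\frac{\partial\theta}{\partial x}\right)^2 + \frac{\partial u}{\partial r}\frac{\partial^2 r}{\partial x^2} + \frac{\partial u}{\partial\theta}\frac{\partial^2\theta}{\partial x^2}.$$

Proceeding similarly with y, and adding, we have

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = \frac{\partial^2 u}{\partial r^2}\left[\left(\frac{\partial r}{\partial x}\right)^2 + \left(\frac{\partial r}{\partial y}\right)^2\right] + 2\frac{\partial^2 u}{\partial r\,\partial\theta}\left(\frac{\partial r}{\partial x}\frac{\partial\theta}{\partial x} + \frac{\partial r}{\partial y}\frac{\partial\theta}{\partial y}\right) + \frac{\partial^2 u}{\partial\theta^2}\left[\left(\frac{\partial\theta}{\partial x}\right)^2 + \left(\frac{\partial\theta}{\partial y}\right)^2\right] + \frac{\partial u}{\partial r}\left(\frac{\partial^2 r}{\partial x^2} + \frac{\partial^2 r}{\partial y^2}\right) + \frac{\partial u}{\partial\theta}\left(\frac{\partial^2\theta}{\partial x^2} + \frac{\partial^2\theta}{\partial y^2}\right).$$

Substituting, this becomes

$$\frac{\partial^2 u}{\partial r^2} + \frac{1}{r^2}\frac{\partial^2 u}{\partial\theta^2} + \frac{1}{r}\frac{\partial u}{\partial r},$$

which can also be written

$$\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial u}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2 u}{\partial\theta^2}.$$

This is the expression for the Laplacian in polar coordinates.

101. Solution of the Differential Equation by Separation. Our differential equation is now

$$\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial u}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2 u}{\partial\theta^2} = \frac{1}{v^2}\frac{\partial^2 u}{\partial t^2}.$$

Let us solve by separation of variables, assuming $u = R(r)\,\Theta(\theta)\,T(t)$. Then, substituting, and dividing by $R\Theta T$, the result is

$$\frac{1}{R}\frac{1}{r}\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \frac{1}{r^2}\frac{1}{\Theta}\frac{d^2\Theta}{d\theta^2} = \frac{1}{v^2}\frac{1}{T}\frac{d^2 T}{dt^2}.$$

The problem is separated: the left side depends only on r and $\theta$, the right on t. Each must then be a constant, which we shall call $-\omega^2/v^2$, giving $d^2T/dt^2 + \omega^2 T = 0$, $T = A\cos\omega t + B\sin\omega t$, and

$$\frac{1}{R}\frac{1}{r}\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \frac{1}{r^2}\frac{1}{\Theta}\frac{d^2\Theta}{d\theta^2} = -\frac{\omega^2}{v^2}.$$

We multiply by $r^2$, and transfer the first term to the right, obtaining

$$\frac{1}{\Theta}\frac{d^2\Theta}{d\theta^2} = -\frac{r}{R}\frac{d}{dr}\left(r\frac{dR}{dr}\right) - \frac{\omega^2 r^2}{v^2}.$$

Again the variables are separated, the left side depending only on $\theta$, the right on r. Let each equal $-m^2$.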
Then $d^2\Theta/d\theta^2 + m^2\Theta = 0$, $\Theta = C\cos m\theta + D\sin m\theta$, and the equation for r can be immediately changed to

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dR}{dr}\right) + \left(\frac{\omega^2}{v^2} - \frac{m^2}{r^2}\right)R = 0.$$

This is just like Bessel's equation (see Prob. 13, Chap. II), except that it has the constant $\omega^2/v^2$ in place of 1. A simple change of variables removes this discrepancy, however. Let $x = \omega r/v$. Then the equation becomes

$$\frac{\omega^2}{v^2}\,\frac{1}{x}\frac{d}{dx}\left(x\frac{dR}{dx}\right) + \frac{\omega^2}{v^2}\left(1 - \frac{m^2}{x^2}\right)R = 0,$$

or cancelling $\omega^2/v^2$, it is exactly Bessel's equation

$$\frac{1}{x}\frac{d}{dx}\left(x\frac{dR}{dx}\right) + \left(1 - \frac{m^2}{x^2}\right)R = 0. \qquad (15)$$

The solution is then $R = \text{constant} \times J_m(x)$, a Bessel's function of the mth order, whose expansion in power series we have already considered, for integral values of m, and for which we have found an approximation in the preceding chapter (Chap. XIV, Prob. 8). We shall see in the next section that only integral m's must be used in the present problem.

102. Boundary Conditions. Consider in the first place the solution for $\Theta$. At a given point of the membrane, the value of $\theta$ is determined, but not in a single-valued way. Thus if the point corresponds to $\theta$ = 47 deg., it would equally well correspond to 47 deg. + 360 deg., or 47 deg. + 720 deg., etc. Now $\Theta$ must surely have a definite value at each point of the membrane. Thus it must have the same value for $\theta$, $\theta + 2\pi$, $\theta + 4\pi$, etc. In other words, $\Theta$ is periodic in $\theta$ with period $2\pi$. But this is true if, and only if, m is an integer. Hence our first condition, necessary to make the function single valued, is that m be an integer.

Next consider the solution for r: $R = J_m(\omega r/v)$, where now m is an integer. At the edge of the membrane, u = 0, which means that R = 0, or $J_m(\omega\rho/v) = 0$. Now $J_m(x)$ is zero only for certain definite values of x, say $x = x_1, x_2, x_3, \ldots$. From the properties of Bessel's functions, we have seen that there are an infinite number of such roots. Thus, to satisfy our boundary conditions, we must let $\omega\rho/v = x_1, x_2, \ldots$. The only adjustable quantity is $\omega$, so that it must be determined by one or another of the values $\omega = v x_1/\rho,\ v x_2/\rho,\ \ldots$. Suppose in particular that $\omega = v x_k/\rho$, determined by the kth root of $J_m$. Then we should properly label it $\omega_{mk}$, since it depends on both these indices. We have thus determined our solution completely, except for the remaining arbitrary constants. These can be easily expressed in the following form:

$$u = (A\cos\omega_{mk}t + B\sin\omega_{mk}t)\,\cos(m\theta - \alpha_{mk})\,J_m\!\left(\frac{\omega_{mk}r}{v}\right).$$

This is a particular solution. The general solution is the sum of such terms, taken over all m's and k's.

[Fig. 22. Nodes of circular membrane. Shaded segments are displaced in opposite phase to unshaded. Panels: m = 0, k = 0; m = 1, k = 0; m = 2, k = 0; m = 0, k = 1; m = 1, k = 1; m = 1, k = 2.]

103. Physical Nature of the Solution. A single term corresponds to a single standing wave. Its nodes are concentric circles, values of r for which $J_m(\omega_{mk}r/v)$ is zero, of which of course the boundary is one; and radii, determined by $\cos(m\theta - \alpha) = 0$, as in Fig. 22. It is readily seen that there are m radial nodes, k circular nodes without counting the boundary. The arbitrary constant $\alpha_{mk}$ determines the angles at which the radial nodes are; changing it simply rotates the whole nodal pattern. The constants A and B determine the amplitude and phase of the disturbance as a function of time. We may, if we choose, consider that there are two separate waves possible for each frequency, $\cos m\theta\,J_m$ and $\sin m\theta\,J_m$. Such a case is called degenerate; we shall see in a problem that the same thing is true of the square membrane. In a degenerate case, with two or more possible vibrations of the same frequency, it is plain that any linear combination of these vibrations gives a possible vibration of this same frequency. As with the rectangular membrane, the set of frequencies $\omega_{mk}$ does not form a simple set of overtones with pitches in harmonic relation to each other.

104. Initial Condition at t = 0.
Suppose we know that at t = 0, the displacement of the membrane is given by $F(r, \theta)$, and the velocity by $G(r, \theta)$. Now we can write the whole solution, in a slightly more general way than before,

$$u = \sum_{m,k}\left[(A_{mk}\cos\omega_{mk}t + B_{mk}\sin\omega_{mk}t)\cos m\theta + (C_{mk}\cos\omega_{mk}t + D_{mk}\sin\omega_{mk}t)\sin m\theta\right] J_m\!\left(\frac{\omega_{mk}r}{v}\right). \qquad (16)$$

Thus, writing displacement and velocity at t = 0, we have

$$F(r, \theta) = \sum_{m,k}(A_{mk}\cos m\theta + C_{mk}\sin m\theta)\,J_m\!\left(\frac{\omega_{mk}r}{v}\right),$$

$$G(r, \theta) = \sum_{m,k}\omega_{mk}(B_{mk}\cos m\theta + D_{mk}\sin m\theta)\,J_m\!\left(\frac{\omega_{mk}r}{v}\right).$$

The A's, B's, C's, D's must be chosen to fit these conditions. Both conditions are of the same sort. They require us to find the coefficients for expanding a function of r and $\theta$ in series of products of sines and cosines and Bessel's functions. Now it proves to be true that both the sines or cosines and the Bessel's functions are orthogonal, and as a result of this we can make the expansions we desire in the usual way, as with Fourier series. Let us take the first equation, multiply by $\cos n\theta\,J_n(\omega_{nl}r/v)$, and integrate over the area of the drum. That is, we integrate with respect to r from 0 to $\rho$, and with respect to $\theta$ from 0 to $2\pi$, and the element of area is $r\,dr\,d\theta$. Then we have

$$\int_0^{\rho}\!\!\int_0^{2\pi} F(r, \theta)\cos n\theta\,J_n\!\left(\frac{\omega_{nl}r}{v}\right) r\,dr\,d\theta = \sum_{m,k}\int_0^{2\pi}(A_{mk}\cos m\theta + C_{mk}\sin m\theta)\cos n\theta\,d\theta \int_0^{\rho} r\,J_m\!\left(\frac{\omega_{mk}r}{v}\right)J_n\!\left(\frac{\omega_{nl}r}{v}\right)dr.$$

By the orthogonal property of the sine and cosine, the right side is zero unless m = n, giving

$$\pi\sum_k A_{nk}\int_0^{\rho} r\,J_n\!\left(\frac{\omega_{nk}r}{v}\right)J_n\!\left(\frac{\omega_{nl}r}{v}\right)dr.$$

But now we shall prove in the next section that the J's are orthogonal in the sense that

$$\int_0^{\rho} r\,J_n\!\left(\frac{\omega_{nk}r}{v}\right)J_n\!\left(\frac{\omega_{nl}r}{v}\right)dr = 0, \quad\text{if } k \neq l.$$

Using this fact, our sum reduces to the single term $\pi A_{nl}\int_0^{\rho} r\,J_n^2(\omega_{nl}r/v)\,dr$. If the last integral, which could be easily computed if we knew the properties of Bessel's functions better, were denoted by $c_{nl}$, then we should have

$$A_{nl} = \frac{1}{\pi c_{nl}}\int_0^{\rho}\!\!\int_0^{2\pi} F(r, \theta)\cos n\theta\,J_n\!\left(\frac{\omega_{nl}r}{v}\right) r\,dr\,d\theta, \qquad (18)$$

determining the coefficients A in terms of a single integral. Similarly we could get formulas for the B's, C's, D's.
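The roots $x_k$ of $J_m$, the frequencies $\omega_{mk} = v x_k/\rho$, and the radial integrals entering the coefficients can all be estimated numerically. What follows is a minimal sketch in Python under stated assumptions, not a method taken from the text: $J_m$ is summed from its power series, the roots are located by a sign-change scan followed by bisection, and the integrals are done by Simpson's rule with the radius scaled so that $\rho = 1$. All function names are hypothetical, and the roots are counted from k = 1.

```python
import math

def bessel_j(m, x, terms=60):
    """J_m(x) for integer m >= 0, summed from its power series."""
    total = 0.0
    term = (x / 2.0) ** m / math.factorial(m)  # the s = 0 term
    for s in range(terms):
        total += term
        # ratio of successive terms: -(x/2)^2 / ((s + 1)(m + s + 1))
        term *= -(x / 2.0) ** 2 / ((s + 1) * (m + s + 1))
    return total

def bessel_zeros(m, count, step=0.1):
    """First `count` positive roots of J_m: sign-change scan, then bisection."""
    zeros = []
    a, fa = step, bessel_j(m, step)
    while len(zeros) < count:
        b = a + step
        fb = bessel_j(m, b)
        if fa * fb < 0.0:
            lo, hi, flo = a, b, fa
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                fmid = bessel_j(m, mid)
                if flo * fmid <= 0.0:
                    hi = mid
                else:
                    lo, flo = mid, fmid
            zeros.append(0.5 * (lo + hi))
        a, fa = b, fb
    return zeros

def mode_frequency(m, k, v, rho):
    """omega_mk = v * x_k / rho, with x_k the k-th root (k = 1, 2, ...) of J_m."""
    return v * bessel_zeros(m, k)[-1] / rho

def radial_overlap(m, xk, xl, n=2000):
    """Simpson estimate of the integral of r J_m(xk r) J_m(xl r) from 0 to 1."""
    h = 1.0 / n
    total = 0.0
    for i in range(n + 1):
        r = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * r * bessel_j(m, xk * r) * bessel_j(m, xl * r)
    return total * h / 3.0

if __name__ == "__main__":
    x = bessel_zeros(0, 3)
    print("first roots of J_0:", x)
    print("overlap of k = 1 and k = 2:", radial_overlap(0, x[0], x[1]))
    print("normalizing integral for k = 1:", radial_overlap(0, x[0], x[0]))
    # v and rho below are illustrative values only.
    print("omega_01 for v = 1, rho = 1:", mode_frequency(0, 1, 1.0, 1.0))
```

The overlap of two different roots should come out negligibly small, while the integral with k = l gives the normalizing constant called $c_{nl}$ above (here for unit radius). The ratios of successive roots are not integers, which is the numerical face of the remark that the overtones are not in harmonic relation.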
Of course, in an actual case, these integrals might be very difficult to compute, but nevertheless we have a general solution of our problem.

105. Proof of Orthogonality of the J's. We can prove the orthogonality of the J's directly from the differential equation, as was done in the last chapter for the nonuniform vibrating string. We wish to prove that

$$\int_0^{\rho} r\,J_n\!\left(\frac{\omega_{nl}r}{v}\right)J_n\!\left(\frac{\omega_{nk}r}{v}\right)dr = 0.$$

Now we have

$$\frac{1}{r}\frac{d}{dr}\left[r\frac{dJ_n(\omega_{nl}r/v)}{dr}\right] + \left(\frac{\omega_{nl}^2}{v^2} - \frac{n^2}{r^2}\right)J_n\!\left(\frac{\omega_{nl}r}{v}\right) = 0,$$

$$\frac{1}{r}\frac{d}{dr}\left[r\frac{dJ_n(\omega_{nk}r/v)}{dr}\right] + \left(\frac{\omega_{nk}^2}{v^2} - \frac{n^2}{r^2}\right)J_n\!\left(\frac{\omega_{nk}r}{v}\right) = 0.$$

Multiply the first by $r\,J_n(\omega_{nk}r/v)$, the second by $r\,J_n(\omega_{nl}r/v)$, subtract, and integrate from 0 to $\rho$. The result is

$$\int_0^{\rho}\left\{J_n\!\left(\frac{\omega_{nk}r}{v}\right)\frac{d}{dr}\left[r\frac{dJ_n(\omega_{nl}r/v)}{dr}\right] - J_n\!\left(\frac{\omega_{nl}r}{v}\right)\frac{d}{dr}\left[r\frac{dJ_n(\omega_{nk}r/v)}{dr}\right]\right\}dr = \left(\frac{\omega_{nk}^2 - \omega_{nl}^2}{v^2}\right)\int_0^{\rho} r\,J_n\!\left(\frac{\omega_{nl}r}{v}\right)J_n\!\left(\frac{\omega_{nk}r}{v}\right)dr.$$

Just as in the last chapter, the left side can be shown to be zero, by integrating by parts. Then the right side must be zero, and either $\omega_{nk}^2 - \omega_{nl}^2$ is zero, which is not true unless k and l refer to the same overtone, or $\int_0^{\rho} r\,J_n(\omega_{nl}r/v)\,J_n(\omega_{nk}r/v)\,dr = 0$, which we wished to prove. The orthogonality is not quite of the form discussed in the last chapter, for the differential equation is of slightly different form, the quantity $(\omega^2/v^2 - n^2/r^2)\,r$ appearing in place of $\omega^2\mu$, so that the final result is not just like integrating $\mu$ times the product of the functions to get zero.

Problems

1. A rectangular drum is 20 by 40 cm., its whole mass is 100 gm., the total pull on the faces 50 and 100 kg., respectively. Find the frequencies, in cycles per second, of the five lowest modes of vibration, and sketch the nodes for each.

2. The special case of degeneracy arises when a rectangular membrane is square. Then the two modes of vibration $e^{i\omega t}\sin(n\pi x/X)\sin(m\pi y/X)$ and $e^{i\omega t}\sin(m\pi x/X)\sin(n\pi y/X)$ have the same frequency (where we let X = Y). Thus any linear combination of these is a solution, again with this frequency. Consider the combinations

$$e^{i\omega t}\left[A\sin\frac{n\pi x}{X}\sin\frac{m\pi y}{X} + B\sin\frac{m\pi x}{X}\sin\frac{n\pi y}{X}\right].$$
Work out the nodes in the case n = 1, m = 2, for (1) B = A; (2) B = -A; (3) B = 2A.

3. A rectangular membrane is struck at its center, starting from rest, in such a way that at t = 0 a small rectangular region about the center may be considered to have a velocity v, and the rest has no velocity. Find the amplitudes of the various overtones.

4. Imagine n and m plotted as two rectangular coordinates. Show that a curve of constant $\omega$, plotted in these coordinates, is an ellipse. Each integral value of n and m corresponds to an overtone, so that if we draw the point corresponding to each overtone, the number of points within such an ellipse gives the number of overtones with angular velocity less than $\omega$. Note that the number of such points per unit area of the plane is just one, and so find an approximate formula, using the area of the ellipse, for the number of overtones of frequency less than $\omega$, and also for the number between $\omega$ and $\omega + d\omega$. Check up this approximation by the exact values of Prob. 1.

5. In the circular membrane, suppose that m = 0, and that k is very large, so that there are many circular nodes. Consider a small region near the edge of the membrane. The few nodes in this neighborhood will be almost straight lines, as if we were near the edge of a rectangular membrane. Find the asymptotic wave length, using the fact that $J_m(x)$ approaches $\cos(x - \alpha)$ at large x, and show that the wave length is connected with the velocity and frequency in the usual manner.

6. Set up the wave equation in three-dimensional spherical coordinates, in which $x = r\sin\theta\cos\phi$, $y = r\sin\theta\sin\phi$, $z = r\cos\theta$. Show that it is

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial u}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial u}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 u}{\partial\phi^2} = \frac{1}{v^2}\frac{\partial^2 u}{\partial t^2}.$$

7. Separate variables in the preceding equation. Show that the function of $\phi$ is $\sin m\phi$ or $\cos m\phi$, where m is an integer.
Show that the equations for r and $\theta$ are respectively

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left(\frac{\omega^2}{v^2} - \frac{C}{r^2}\right)R = 0,$$

where $\omega$, C are constants;

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \left(C - \frac{m^2}{\sin^2\theta}\right)\Theta = 0.$$

8. The equation for $\Theta$ in Prob. 7 is called Legendre's equation. Let $\Theta = \sin^m\theta\,F(\cos\theta)$. Find the differential equation for F, solving in power series in $\cos\theta$, and show that the series breaks off if $C = l(l + 1)$, where l is an integer. The resulting functions are called $P_l^m(\cos\theta)$, and are known as associated Legendre functions. Compute the first few Legendre functions.

9. In the equation for r in Prob. 7, prove that $R = J_{l+1/2}(x)/\sqrt{x}$, where $x = \omega r/v$.

10. Prove that two functions $u_n$ and $u_m$, satisfying differential equations of the form

$$\frac{d}{dx}\left[T(x)\frac{du_n}{dx}\right] + \left[\omega_n^2\,\mu(x) - f(x)\right]u_n = 0,$$

with different $\omega_n$'s, but chosen so that both $u_n$ and $u_m$ are zero at x = 0 and x = L, satisfy the orthogonality condition $\int_0^L \mu(x)\,u_n(x)\,u_m(x)\,dx = 0$.

CHAPTER XVI

STRESSES, STRAINS, AND VIBRATIONS OF AN ELASTIC SOLID

In the preceding chapters, we have been treating the vibrations of elastic strings and membranes, one- and two-dimensional bodies, and now we pass to the three-dimensional case, or the elastic solid. Of course, the strings and membranes were really elastic solids, of particular shapes. But there are several ways in which we must give a more general treatment than we have previously done. First, in the strings and membranes, the rigidity of the material itself was not great enough to affect the vibration, whereas in the problems we now take up this rigidity, or the elastic properties of the material in general, will be important. Thus we may imagine all gradations of the problem of a stretched wire, from the limiting case of a very thin long wire under large tension, when our previous theory is applicable, down to a short thick bar under small tension or even with no tension at all, when the restoring force on a particle, far from coming from the tension on the ends, comes from the distortion of the bar itself.
Secondly, with the strings and membranes, we considered only transverse vibrations, while here we discuss longitudinal vibrations as well. Of course, strings can vibrate longitudinally, but we have so far neglected this phase of their motion. Thirdly, a very important part of the problems of strings and membranes has arisen from the fact that they were limited in space, the membranes being very thin pieces of material, the strings thin in two dimensions. But while some of the problems of the present chapter have this property, we shall also consider vibrations and waves in extended media going, in the limiting case, to infinity in all dimensions, as sound waves in an infinite gas or solid. It is these sound waves which show the best analogy to our one- and two-dimensional wave equations.

106. Stresses, Body and Surface Forces. The first step in discussing the vibrations of an elastic solid, as with the string and membrane, is to find the force acting on an infinitesimal volume element, and to set this equal to mass times acceleration. The forces may be divided into two classes: (1) volume or body forces, such as gravity, which act on each volume element of the body, and which for the present we neglect, since we shall not use them in our applications; and (2) surface forces, with which neighboring parts of the medium act on each other, and which are transmitted across surfaces, or the forces transmitted across the bounding surface of the whole body. The tensions which we have met with string and membrane are examples of such forces, or pressures in a gas, or shearing forces in a twisted rod. To specify such a force, we imagine a surface element dA to be drawn somewhere in the body, with a normal n.
The material on either side of dA exerts a force on the material on the other side; thus this force is a push normal to the surface if there is a pressure in the body, it is a tension if that is the form of stress, or it may be a shearing force. The force exerted by the material on one side, on the material on the second side, and the other force exerted by the material on the second side back on the first side, are action and reaction, and are equal and opposite, so that one always has an ambiguity of sign in dealing with these forces, or as we call them stresses. We adopt the following convention: We imagine dA to be part of the surface bounding a volume, and n to be the outer normal. Then the force we deal with is the force exerted by the outside on the material inside the volume, over dA. Now this force will be a vector, and proportional to dA; we call its x, y, and z components $X_n\,dA$, $Y_n\,dA$, $Z_n\,dA$, respectively. The capital letters indicate the force components, and the subscript n denotes not a component but the direction of the surface normal. The properties of a stress can be completely specified if we choose three unit areas at a point, one normal to each of the three coordinate axes, and give the components of the force acting across each. Thus for the surfaces normal to the x, y, and z axes, we have the three force vectors, or nine quantities,

$$\begin{matrix} X_x & Y_x & Z_x \\ X_y & Y_y & Z_y \\ X_z & Y_z & Z_z. \end{matrix} \qquad (1)$$

We see in Fig. 23 the significance of the three components $X_x$, $Y_x$, $Z_x$. This set of nine quantities forms the so-called stress tensor. The diagonal terms of the array, $X_x$, $Y_y$, $Z_z$, are called the normal stresses or pressures, since the force components act normal to the surface, and the remaining terms are called shearing or tangential stresses.
It is easily shown that the force across an arbitrary surface which has direction cosines l, m, n for its normal has an x component $lX_x + mX_y + nX_z$, with corresponding formulas for the other components.

[Fig. 23. Components of force acting across dy dz.]

107. Examples of Stresses. The simplest stress is probably the hydrostatic pressure. There the force acting across a square centimeter is always at right angles to the area, and its magnitude is by definition the pressure P. The force acts into the body, and hence is of negative sign. We thus have $X_x = Y_y = Z_z = -P$, all other components = 0. A second example is a tension, say in the x direction. Then the unit area perpendicular to x has a force T exerted across it, normal to the area, but there is no force exerted across faces perpendicular to y or z. In other words, $X_x = T$, all other components of the stress are zero. A third example is a shear. In Fig. 24a, we have a cube of material, with equal and opposite tangential forces exerted across the faces normal to x, the forces acting in the y direction. Over the right face, the force exerted on the material is in the -y direction, so that for this face we have $Y_x = -S$, a constant, and $X_x = Z_x = 0$. Over the opposite face, both force and direction of normal are reversed, so that the stress components are unchanged. But now we notice an important feature of shearing stress: the two forces we have mentioned exert a torque or couple on the cube, and if they were the only forces acting, it could not be in equilibrium. To get equilibrium, it proves necessary to have at the same time tangential forces exerted across the faces perpendicular to the y axis, as in Fig. 24b. These forces are equal in magnitude to the other, so that the torques obviously balance, and we have $X_y = Y_x = -S$, all other components equal to zero.
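The rule $lX_x + mX_y + nX_z$ for the force across an arbitrary surface, together with the three example stresses just described, can be written out as a short computation. The following Python sketch is illustrative only: the function names are hypothetical, and the stress is stored as a 3 by 3 list whose rows follow array (1), so that the row index names the face normal and the column index names the force component.

```python
import math

def traction(stress, normal):
    """Force per unit area across a surface with direction cosines (l, m, n).

    stress[i][j] is the j-th force component across the face normal to
    axis i, i.e. the rows are (X_x, Y_x, Z_x), (X_y, Y_y, Z_y),
    (X_z, Y_z, Z_z), as in array (1).
    """
    return tuple(
        sum(normal[i] * stress[i][j] for i in range(3)) for j in range(3)
    )

def normal_component(stress, normal):
    """Component of the traction along the surface normal itself."""
    t = traction(stress, normal)
    return sum(t[j] * normal[j] for j in range(3))

def hydrostatic(P):
    """X_x = Y_y = Z_z = -P, all other components zero."""
    return [[-P, 0.0, 0.0], [0.0, -P, 0.0], [0.0, 0.0, -P]]

def tension_x(T):
    """X_x = T, all other components zero."""
    return [[T, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

def shear_xy(S):
    """X_y = Y_x = -S, all other components zero."""
    return [[0.0, -S, 0.0], [-S, 0.0, 0.0], [0.0, 0.0, 0.0]]

if __name__ == "__main__":
    # P, T, S below are arbitrary illustrative magnitudes.
    x_face = (1.0, 0.0, 0.0)
    y_face = (0.0, 1.0, 0.0)
    oblique = (1.0 / math.sqrt(3.0),) * 3
    print("pressure, oblique face:", traction(hydrostatic(2.0), oblique))
    print("tension, x face:", traction(tension_x(5.0), x_face))
    print("tension, y face:", traction(tension_x(5.0), y_face))
    print("shear, x face:", traction(shear_xy(3.0), x_face))
    print("shear, y face:", traction(shear_xy(3.0), y_face))
```

For the hydrostatic case the traction is -P times the normal whatever the orientation of the surface, which is the sense in which the force is "always at right angles to the area". For the shear, the faces normal to x and y carry purely tangential forces, as described above.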
This property, that $X_y = Y_x$, proves to be general: the stress tensor is symmetrical about its diagonal. By making a proper rotation of axes, it is always possible to reduce a stress to diagonal form, in which no shearing stresses appear. Thus, in the case we have just considered, the problem is obviously symmetrical about the diagonal of the cube. In Fig. 24c, we take a surface element whose normal has direction cosines $l = -1/\sqrt{2}$, $m = 1/\sqrt{2}$, $n = 0$, giving a force exerted across it of components $-S/\sqrt{2}$, $S/\sqrt{2}$, 0, or a force of magnitude S normal to the surface. Similarly in Fig. 24d, we have a surface at right angles, and find again a force normal to the surface, but now of magnitude -S. Thus, if we take as new axes the two 45-deg. diagonals in the xy plane, and the z axis, the stress consists of a tension S along one axis, negative tension (or pressure-like force) at right angles, and zero stress across the face normal to z. Axes of this sort, in which each face has a pure pressure- or tension-like force across it, and no shear, are called principal axes of stress.

[Fig. 24. Diagram of shearing stress. (a) Shear over the faces perpendicular to the x axis. (b) Additional shear over faces perpendicular to the y axis, necessary to balance the turning moment of the shear indicated in (a). (c) and (d) Stress system of (b) referred to principal axes, tension in (c), pressure in (d).]

108. The Equation of Motion. Let us find the force on a small element of volume, having sides dx, dy, dz. Over the face at x + dx, there will be a force $X_x(x + dx)$, $Y_x(x + dx)$, $Z_x(x + dx)$ per unit area. Similarly exerted over the face at x there will be a force $-X_x(x)$, $-Y_x(x)$, $-Z_x(x)$.
The x component of the resulting force is $X_x(x + dx) - X_x(x) = \dfrac{\partial X_x}{\partial x}\,dx$ per unit area, or $\dfrac{\partial X_x}{\partial x}\,dx\,dy\,dz$ for the area dy dz. The y and z components are $\dfrac{\partial Y_x}{\partial x}\,dx\,dy\,dz$ and $\dfrac{\partial Z_x}{\partial x}\,dx\,dy\,dz$, respectively. In the same way we can find the three components of force exerted over each of the two other pairs of faces. Adding, we have for the total x component of force $\left(\dfrac{\partial X_x}{\partial x} + \dfrac{\partial X_y}{\partial y} + \dfrac{\partial X_z}{\partial z}\right)dx\,dy\,dz$. Thus, if $v_x$, $v_y$, $v_z$ are the components of velocity of the solid at the point in question, the equations of motion, remembering that the mass of our small volume is $\rho\,dx\,dy\,dz$, are

$$\begin{aligned}
\frac{\partial X_x}{\partial x} + \frac{\partial X_y}{\partial y} + \frac{\partial X_z}{\partial z} &= \rho\frac{\partial v_x}{\partial t}, \\
\frac{\partial Y_x}{\partial x} + \frac{\partial Y_y}{\partial y} + \frac{\partial Y_z}{\partial z} &= \rho\frac{\partial v_y}{\partial t}, \\
\frac{\partial Z_x}{\partial x} + \frac{\partial Z_y}{\partial y} + \frac{\partial Z_z}{\partial z} &= \rho\frac{\partial v_z}{\partial t}.
\end{aligned} \qquad (2)$$

These equations are evidently simply the generalization of those used previously with the string and membrane. Thus with the membrane let the z axis be normal to the plane of the membrane. We consider then only the third equation, giving velocity along z. The stress is a tension along the membrane, and if we cut the membrane with a surface perpendicular to x, we see that, if the membrane is inclined so that it makes an angle $\alpha$ with the x axis, there will be a component $Z_x$, a force in the direction to produce acceleration, equal to $T\alpha$. If then $\alpha = \partial u/\partial x$, where u is the displacement along z, the first term becomes $T(\partial^2 u/\partial x^2)$, as we found before. Similarly the second term is $T(\partial^2 u/\partial y^2)$, and the third is zero, yielding the equation of vibration which we have already used.

109. Transverse Waves. Two sorts of waves are possible in an elastic solid: transverse waves, in which the displacement is at right angles to the direction of propagation of the waves, and longitudinal waves, as the sound waves in a gas, in which the displacement is in the direction of propagation. We consider first transverse waves.
Rather than taking the general case, which involves rather complicated formulas, we assume that our wave is being propagated along the x axis, and that the displacement of the particles is in the y direction. We shall expect to get a wave equation involving only x derivatives, not y or z, and having as solutions either progressive or standing waves. Let the displacement of a particle in the y direction be $\eta$; since the wave is being propagated along the x axis, we assume that it has wave fronts normal to x, such that every point on a wave front has the same displacement, and this means that $\eta$ is a function of x only. We may then consider a thin sheet or lamina, as that between x and x + dx in Fig. 25. Let us suppose that the two points which in the unstrained medium were at x and x + dx, y = 0, are displaced to the points P and P', at distances $\eta(x)$ and $\eta(x + dx)$, respectively, from the axis. Then evidently the lamina has been sheared, and we must find the relation between the shearing stress and the strain (that is, displacement) which it has produced.

[Fig. 25. Shear in a transverse plane wave.]

The type of stress is evidently the sort described in Fig. 24. The material to the right of x + dx exerts across unit cross section of the face a force in the y direction, equal to $Y_x$ (or $X_y$). But now Hooke's law says that the actual deformation of the material, or the strain, is proportional to the stress acting. In this particular case, the deformation is a shearing one, and is opposed by the rigidity of the medium (which is the reason why a liquid, having no rigidity, cannot have transverse waves). The deformation is given in terms of the coefficient of rigidity $\mu$ as follows: the strain, measured by the tangent of the angle which the line PP' makes with the x axis, is equal to the shearing stress divided by $\mu$. In other words, $Y_x = \mu\,\partial\eta/\partial x$.
Substituting this relation between stress and strain in the equations of motion, we have at once

$$\frac{\partial}{\partial x}\left(\mu\frac{\partial\eta}{\partial x}\right) = \rho\frac{\partial v_y}{\partial t},$$

or, writing $v_y = \partial\eta/\partial t$,

$$\frac{\partial^2\eta}{\partial x^2} = \frac{\rho}{\mu}\frac{\partial^2\eta}{\partial t^2}, \qquad (3)$$

the one-dimensional wave equation, representing transverse waves propagated with the velocity $\sqrt{\mu/\rho}$, or the square root of elastic modulus divided by density. Of course, we should have got the three-dimensional wave equation if we had considered propagation in an arbitrary direction.

110. Longitudinal Waves. Here again we consider propagation along the x direction. In Fig. 26, let the displacement of a particle in the x direction be $\xi(x)$, a function of x only. Evidently the stress in this case is a pure tension, positive or negative, so that the force across unit cross section is a pull in the x direction, equal to $X_x$. Hooke's law now states that the tension is proportional to the strain; and in particular, that it is proportional to the change in thickness of the lamina [which is evidently $\xi(x + dx) - \xi(x)$] divided by the thickness. The constant of proportionality in this case is not one of the simple elastic constants; it proves to be written $(\lambda + 2\mu)$, where $\lambda$ is an elastic constant whose physical meaning is not easy to state. Perhaps as good an interpretation of $\lambda$ as any is simply to define it from this particular sort of deformation. We now have $X_x = (\lambda + 2\mu)\,\partial\xi/\partial x$, all other components of stress = 0, so that from the equations of motion we at once have

$$\frac{\partial^2\xi}{\partial x^2} = \frac{\rho}{\lambda + 2\mu}\frac{\partial^2\xi}{\partial t^2}, \qquad (4)$$

again a wave equation, representing a longitudinal wave traveling with velocity $\sqrt{(\lambda + 2\mu)/\rho}$, different from the velocity of the transverse wave.

[Fig. 26. Compression and rarefaction in a longitudinal plane wave.]

111. General Wave Propagation. In the two preceding sections, we have derived two very specialized waves which can be propagated in an elastic solid, plane longitudinal and transverse waves traveling along the x axis. Of course, much more complicated waves are possible, and if we were discussing the problem completely, we should set up the three-dimensional wave equation, of the form $\nabla^2 u = \dfrac{1}{v^2}\dfrac{\partial^2 u}{\partial t^2}$, and derive general wave solutions. We should have separate equations for the longitudinal and transverse waves, generalizations of Eqs. (3) and (4). As we shall learn later when discussing optical problems, such a wave equation has as solutions not merely plane waves traveling in all arbitrary directions, but also spherical waves diverging from point sources, and many more complicated types of waves. All these are possible in an elastic solid. In our discussion of the plane waves, we separated the longitudinal and transverse waves entirely, allowing one type to exist without the other, but unfortunately in general this cannot be done. For instance, when a wave of one type is reflected from a surface, then unless the reflection is at normal incidence, longitudinal motion will generally be partly converted into transverse, and vice versa. In Fig. 27, we show diagrammatically how this could be, the transverse motion in the incident wave evidently being in such a direction as to be partly transformed into longitudinal motion in the reflected wave. For this reason, the complete treatment of the vibrations of an elastic solid is a very complicated problem. An example is found in geophysical problems, where one is interested in the propagation of earthquake waves through the earth.

[Fig. 27. Incident transverse wave, with longitudinal reflected wave.]
This case is made even more difficult by the fact that the elastic properties of the earth change as a function of depth, so that one must use solutions of the form we have discussed in Chap. XIV, in connection with strings whose properties depend on position.

There is one application of the theory of the waves in an elastic solid which has at least historical interest. When it was discovered that light was a transverse wave motion, it was attempted to identify these waves with the transverse vibrations of an elastic solid, the ether. The general properties, and even some of the details, as the quantitative laws giving the fraction of light reflected and transmitted at a boundary, were correctly worked out, the reflection being treated by analogy with our discussion of reflection of waves in strings at a point of discontinuity of density, in Chap. XIV. But the difficulty, which could not be overcome, was that of eliminating the longitudinal waves, which certainly do not occur in optics, but which were inherent in the elastic solid theory. This difficulty does not occur in the present electromagnetic theory, where only transverse waves are allowed by the fundamental differential equations. This lack of longitudinal waves makes the problem of optical wave motion on the whole simpler than that of elastic waves.

112. Strains and Hooke's Law. In discussing transverse and longitudinal elastic waves, we had to introduce certain elastic constants, measuring the ratio between stress components, and certain quantities measuring the strain or deformation of the substance. The fact that these strains were proportional to the stresses is Hooke's law, the fundamental law of elasticity, holding for sufficiently small strains. It is now worth while to state the general relation between stress and strain, though we shall not go through the proof. To begin with, we imagine the body unstrained.
Then in the process of deformation, we imagine that the particle originally at x, y, z has been displaced to a point $x + \xi$, $y + \eta$, $z + \zeta$. The three quantities $\xi$, $\eta$, $\zeta$ are functions of x, y, z, and are the three components of a vector. We meet, in other words, a vector (which we may call the displacement), which is a function of position. Such a vector field reminds us of a force field, as a gravitational- or electric-force field, where the force vector on unit mass or charge, respectively, is a function of position. We shall meet such vector fields often in the future. Now, the displacement is not the same thing as the strain; the body might be displaced bodily, without involving any stress or strain at all. It is only when the displacement of one side of a small element of volume is different from the other, so that the element is distorted in size or shape, that we have a strain. In other words, the essential quantities in determining the strain are the derivatives of $\xi$, $\eta$, $\zeta$ with respect to x, y, z. We have already seen two examples: with the shear in the transverse wave, the strain was $\partial\eta/\partial x$, and in the compressional wave the strain was $\partial\xi/\partial x$. In the two cases mentioned, the stress was proportional to the corresponding partial derivative, and Hooke's law means that this is true in general, in the form that the components of stress are linear functions of the partial derivatives of the components of displacement. There are nine components of stress, of which six are independent (remembering that $X_y = Y_x$, etc.), and similarly there are nine partial derivatives of displacement, of which it can be proved that six again are independent. This would mean six linear equations, with thirty-six coefficients, which would act as elastic constants.
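In the most general type of substance, a completely anisotropic crystal, it can be shown that twenty-one of these really are independent, giving a tremendous number of elastic constants. With isotropic substances showing no crystalline structure, however, most of these constants are either zero or can be written in terms of each other, and there are only two independent constants, the $\lambda$ and $\mu$ which we have already met; all other elastic constants, such as Young's modulus and the compressibility, can be written in terms of them. Using these constants, the relations between stress and strain prove to have the following form:

$$
\begin{aligned}
X_x &= (2\mu + \lambda)\frac{\partial\xi}{\partial x} + \lambda\frac{\partial\eta}{\partial y} + \lambda\frac{\partial\zeta}{\partial z}, & X_y &= \mu\left(\frac{\partial\xi}{\partial y} + \frac{\partial\eta}{\partial x}\right),\\
Y_y &= \lambda\frac{\partial\xi}{\partial x} + (2\mu + \lambda)\frac{\partial\eta}{\partial y} + \lambda\frac{\partial\zeta}{\partial z}, & Y_z &= \mu\left(\frac{\partial\eta}{\partial z} + \frac{\partial\zeta}{\partial y}\right),\\
Z_z &= \lambda\frac{\partial\xi}{\partial x} + \lambda\frac{\partial\eta}{\partial y} + (2\mu + \lambda)\frac{\partial\zeta}{\partial z}, & Z_x &= \mu\left(\frac{\partial\zeta}{\partial x} + \frac{\partial\xi}{\partial z}\right).
\end{aligned}
\tag{5}
$$

In the cases we have taken up already, we have seen two illustrations of these equations: with transverse waves, $\partial\eta/\partial x$ was the only partial derivative different from zero, and we had $X_y = \mu\,\partial\eta/\partial x$; with the longitudinal wave, $\partial\xi/\partial x$ was the only term different from zero, and as we see this gives $X_x = (2\mu + \lambda)\,\partial\xi/\partial x$, as we had before, but also $Y_y = Z_z = \lambda\,\partial\xi/\partial x$. These latter stress components, however, since they do not depend on $y$ or $z$, do not contribute to the equations of motion, as we see by referring back to these.

113. Young's Modulus.

To illustrate the use of the equations connecting stress and strain, we shall discuss the stretching of a wire. Let the wire be stretched along the $x$ axis, and let the stress be a pure tension $T$, so that $X_x = T$, and all other stress components are zero. The $x, y, z$ axes are principal axes for this stress, and it can be shown that the strain has principal axes, too, parallel to those of stress, so that the last three equations, for $X_y$, etc., do not enter. We are left, then, with the three equations

$$
\begin{aligned}
T &= (2\mu + \lambda)\frac{\partial\xi}{\partial x} + \lambda\frac{\partial\eta}{\partial y} + \lambda\frac{\partial\zeta}{\partial z},\\
0 &= \lambda\frac{\partial\xi}{\partial x} + (2\mu + \lambda)\frac{\partial\eta}{\partial y} + \lambda\frac{\partial\zeta}{\partial z},\\
0 &= \lambda\frac{\partial\xi}{\partial x} + \lambda\frac{\partial\eta}{\partial y} + (2\mu + \lambda)\frac{\partial\zeta}{\partial z}.
\end{aligned}
$$

Subtracting the third from the second, we have $\partial\eta/\partial y = \partial\zeta/\partial z$.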
Using this relation, either the second or third gives

$$\frac{\partial\eta}{\partial y} = -\sigma\frac{\partial\xi}{\partial x}, \qquad \text{where } \sigma = \frac{\lambda}{2(\lambda + \mu)},$$

and $\sigma$ is called Poisson's ratio. Since $\lambda$ and $\mu$ are always positive, it is obvious that Poisson's ratio is never greater than $\tfrac{1}{2}$. We have found, then, that as the wire is stretched (positive $\partial\xi/\partial x$), it contracts sidewise (negative $\partial\eta/\partial y$ and $\partial\zeta/\partial z$), and the ratio of sidewise contraction per unit width to lengthwise stretch per unit length is given by Poisson's ratio. Actual materials have Poisson's ratio of the order of magnitude of a few tenths. Now we put this expression back in the first equation, obtaining

$$T = (2\mu + \lambda - 2\lambda\sigma)\frac{\partial\xi}{\partial x}.$$

The elastic modulus $(2\mu + \lambda - 2\lambda\sigma)$, giving the tension, or force per unit area, divided by the elongation per unit length, is called Young's modulus, and is denoted by $E$. In the problems we find other ways of writing the relations between Young's modulus, Poisson's ratio, and the other elastic constants.

It is worth noticing that Young's modulus was not the elastic constant which entered into the velocity of compressional waves. If we had longitudinal waves traveling down a wire, the wire would contract laterally at those points where it was under tension, and expand where it was under compression, as given by Poisson's ratio, and for such a wave the velocity would be determined from Young's modulus. But in our extended medium, we did not allow the possibility of the lateral motion connected with such a contraction and expansion, since in a medium of large dimensions compared with the wave length this would amount to a very large transverse motion. We assumed instead that the motion was purely longitudinal, and found that we had to assume the existence of other lateral stresses, tensions $Y_y$ and $Z_z$, to counteract the tendency to expansion and contraction.
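The stretched-wire relations above can be checked numerically. The following is a minimal sketch, not part of the original treatment: the function names and the numerical values of $\lambda$ and $\mu$ are illustrative assumptions in arbitrary units. It computes Poisson's ratio and Young's modulus from $\lambda$ and $\mu$, and verifies that the resulting strains satisfy the three stress equations for a pure tension $T$.

```python
def lame_to_engineering(lam, mu):
    """Return (E, sigma) from the elastic constants lambda and mu."""
    sigma = lam / (2.0 * (lam + mu))         # Poisson's ratio
    E = 2.0 * mu + lam - 2.0 * lam * sigma   # Young's modulus
    return E, sigma


def wire_strains(T, lam, mu):
    """Strains (d xi/dx, d eta/dy, d zeta/dz) in a wire under pure tension T."""
    E, sigma = lame_to_engineering(lam, mu)
    exx = T / E            # lengthwise stretch per unit length
    eyy = -sigma * exx     # sidewise contraction per unit width
    return exx, eyy, eyy


# Illustrative values only (arbitrary units, not data for a real material).
lam, mu, T = 2.0, 1.0, 1.0
E, sigma = lame_to_engineering(lam, mu)
exx, eyy, ezz = wire_strains(T, lam, mu)

# Residuals of the first two stress equations; both should vanish.
r1 = (2 * mu + lam) * exx + lam * eyy + lam * ezz - T
r2 = lam * exx + (2 * mu + lam) * eyy + lam * ezz
print(E, sigma, r1, r2)
```

With these assumed values the ratio comes out as $\sigma = 1/3$ and $E = 8/3$, and both residuals vanish to rounding error.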
These stresses changed the conditions of the problem, and in particular the elastic modulus concerned in the velocity of propagation of the wave.

Problems

1. In Fig. 28, let the normal to the inclined face of the prism have direction cosines $l, m, n$. Compute the total forces exerted by an arbitrary stress on the prism, and prove that the net force is zero, and the prism is in equilibrium, only if the force per unit area over the face perpendicular to $n$ has $x$ component $lX_x + mX_y + nX_z$, etc.

Fig. 28. Prism for computing force exerted by stresses across a face with arbitrary normal $n$.

2. Rotate coordinates to reduce an arbitrary stress to principal axes. Carry through the problem of the pure shear, discussed in Fig. 24, as an illustration of the general method.

3. Prove that in terms of Young's modulus and Poisson's ratio we have

$$\lambda = \frac{E\sigma}{(1 + \sigma)(1 - 2\sigma)}, \qquad \mu = \frac{E}{2(1 + \sigma)}.$$

4. Assume a body is under pure hydrostatic pressure $P$. Show that the distortion is a decrease of all dimensions by a fixed fraction. Show that the fractional change in volume is $\partial\xi/\partial x + \partial\eta/\partial y + \partial\zeta/\partial z$. Using this, show that the compressibility $\kappa$ of a solid under hydrostatic pressure, which by definition is the fractional decrease of volume divided by the pressure, equals $3(1 - 2\sigma)/E$.

5. Show that the velocity of a longitudinal wave in a fluid, for which $\mu$ is zero, is $1/\sqrt{\kappa\rho}$, where $\kappa$ is the compressibility.

6. A rectangular beam held at one end is bent into an arc of a circle, the radius of curvature of its central section being $R$. Find the stress distribution throughout the beam, showing that the beam will be kept in equilibrium by a torque or couple of the sort indicated. Show that for a given torque the curvature of the beam is inversely proportional to $ab^3E$, where $E$ is Young's modulus (see Fig. 29).

Fig. 29. Bent beam.

7. A circular cylinder of height $h$ rests in equilibrium under the action of gravity.
Take a coordinate system with the $xy$ plane in the top base of the cylinder and the positive $z$ axis pointing downward. Show that the only component of stress different from zero is $Z_z = -\rho g z$, if $\rho$ is the density of the cylinder. Using Hooke's law show that the strains are $\partial\xi/\partial x = \partial\eta/\partial y = (\sigma/E)\rho g z$, and $\partial\zeta/\partial z = -(1/E)\rho g z$, and find the other partial derivatives. Integrate these expressions to find the components of the displacement of any point of the medium, remembering that the strains are partial derivatives. Show that a horizontal plane section of the cylinder becomes a paraboloid of rotation due to the deformation. Show that the radius of the cylinder increases from top to bottom when it is thus deformed.

8. A spherical shell of inner radius $R_1$, outer radius $R_2$, contains a fluid of pressure $P_1$, and is immersed in a second fluid of lower pressure $P_2$. It can be shown that the displacements of points on account of the pressure are given by $\xi = x(A + B/r^3)$, $\eta = y(A + B/r^3)$, $\zeta = z(A + B/r^3)$. Verify these values by computing the stresses at any point, substituting in the equations of motion, and showing that they result in equilibrium. Show further that the force across an area normal to the radius is itself normal to the surface, so that the stress within the sphere can be balanced by hydrostatic pressures within and without.

9. In the shell of Prob. 8, determine $A$ and $B$ so that the pressure will have the proper values at $R_1$ and $R_2$. Discuss the stress within the shell, showing that the principal axes at any point are along the radius and two arbitrary directions at right angles, and find the tension or pressure along the directions at right angles, discussing the final result physically, with special reference to possible breaking of the shell under excessive pressure inside.

CHAPTER XVII

FLOW OF FLUIDS

In the last chapter we discussed the equation of motion of an elastic body where there was no mass motion or flow.
Now we pass to hydrodynamics and the flow of fluids. Much of what we say, however, applies to flow in general, such as heat flow, which we shall take up in the next chapter, and even to such a different subject as electrostatics. The feature in common in all these problems is the existence of a vector field. By that we mean a vector defined at each point of space. We have already met such a field in our general discussion of forces and potentials in Chap. VI, for the force is defined at every point of space and forms a vector field. In the present case the vector is the velocity of the flowing fluid, or the closely related flux density. With heat flow it is again a flux density for the flowing heat, and for electricity the electric field. All these problems, though so different physically, are thus mathematically similar and can be treated by the same analytical methods.

114. Velocity, Flux Density, and Lines of Flow.

At every point of a flowing medium, we can define the velocity $\mathbf{v}$, a vector (the time rate of change of the displacement, which we used in the last chapter, and to which we assigned components $\xi, \eta, \zeta$). Also we can give the density $\rho$, and both $\rho$ and $\mathbf{v}$ are in general functions of position $(x, y, z)$ and of time. We may now ask, How much material will flow across any area per second? This total flow across a surface is called a flux. In Fig. 30, we consider an infinitesimal surface element $dS$. With $dS$ as a base we erect a prism, the slant height being the velocity $\mathbf{v}$, which in general is not normal to $dS$. Evidently the material in the prism will just be that which crosses $dS$ in one second, since in this time it will move a distance $v$, and fill the dotted prism. But this is $\rho$ (the density) times the volume of the prism (the base $dS$ times the altitude $v_n$, where $n$ is the normal to the surface), or $\rho v_n\,dS$.

Fig. 30. Flux through an area $dS$.
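The prism argument can be illustrated with a few lines of arithmetic. This is a small sketch under assumed values: the function name `flux_through_element` and the numbers for density, velocity, and area are hypothetical, chosen only to show how the flux $\rho v_n\,dS$ depends on the orientation of the surface element.

```python
import math


def flux_through_element(rho, v, n, dS):
    """Material crossing a small flat area dS per second: rho * v_n * dS.

    v is the velocity vector and n is any vector along the normal to the
    element; n is normalized here so that v_n is the normal component of v.
    """
    norm = math.sqrt(sum(c * c for c in n))
    v_n = sum(vc * nc for vc, nc in zip(v, n)) / norm
    return rho * v_n * dS


# Illustrative values: density 1000, speed 2 along x, area 0.01 (consistent units).
rho = 1000.0
v = (2.0, 0.0, 0.0)
head_on = flux_through_element(rho, v, (1.0, 0.0, 0.0), 0.01)  # normal along the flow
tilted = flux_through_element(rho, v, (1.0, 1.0, 0.0), 0.01)   # normal at 45 degrees
edge_on = flux_through_element(rho, v, (0.0, 1.0, 0.0), 0.01)  # flow tangential to dS
print(head_on, tilted, edge_on)
```

The flux is largest when the element faces the flow, falls off as the cosine of the tilt, and vanishes when the velocity is tangential to the element, which is the property used for tubes of flow.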
The quantity $\rho\mathbf{v}$ is called the flux density, and we may denote it by $\mathbf{f}$. Then for a finite area, the total flux will be the sum of the contributions from all the surface elements, or a surface integral $\iint f_n\,dS = \iint \rho v_n\,dS$. In some kinds of flow, such as heat flow, there is an analogue to the flux, but not to the density and velocity separately, so that one regards the flux density as being the more fundamental vector field.

We can draw lines through the medium, tangent at every point to the direction of flow at that point. These are called lines of flow. Similarly we can set up tubes of flow, the elements of their surfaces being lines of flow. We can imagine the substance to flow through these tubes, as water flows through a pipe, never passing outside, since the velocity is always tangential to the surface of the tube. In hydrodynamics these lines of flow are called streamlines, and the sort of flow in which they are independent of time is called streamline flow.

115. The Equation of Continuity.

Consider a fixed volume in a flowing fluid. The amount of fluid in the volume is $\iiint \rho\,dv$, and this can change in two ways. First, liquid can flow into the volume over the surfaces. Secondly, it may be possible for liquid to be produced within the volume without having flowed in. For instance, in a swimming pool, for all practical purposes we may consider the opening of the inlet pipe as a region where fluid is appearing, and the outlet as a place where it is disappearing. Such regions are called sources and sinks, respectively. Then we have

$$\iiint \frac{\partial\rho}{\partial t}\,dv = \text{rate of inflow over the surface} + \text{rate of production inside}.$$

Now we have just seen that the rate of flow over any surface, or flux, is $\iint f_n\,dS$. This represents outflow if $n$ is the outer normal to a closed surface, so that we must change sign to get inflow.
If in addition we assume that the rate of production of material per unit volume is $P$, we have

$$\iiint \frac{\partial\rho}{\partial t}\,dv = -\iint f_n\,dS + \iiint P\,dv,$$

the volume integrals being over the whole region we are considering, the surface integral over the surface enclosing this volume.

If we now apply our equation to an infinitesimal volume in the form of a rectangular parallelepiped, bounded by $x$, $x + dx$, $y$, $y + dy$, $z$, $z + dz$, we can put the equation in a form not involving integrals. The flow to the right (into the volume) over the face at $x$ is $f_x(x)\,dy\,dz$. The flow to the right (out of the volume) over the face at $x + dx$ is

$$f_x(x + dx)\,dy\,dz = f_x(x)\,dy\,dz + \frac{\partial f_x}{\partial x}\,dx\,dy\,dz + \cdots.$$

Thus the net inflow over these two faces is $-\dfrac{\partial f_x}{\partial x}\,dx\,dy\,dz$. Adding similar contributions from the other faces we have for the total inflow

$$-\iint f_n\,dS = -\left(\frac{\partial f_x}{\partial x} + \frac{\partial f_y}{\partial y} + \frac{\partial f_z}{\partial z}\right)dx\,dy\,dz = -(\nabla\cdot\mathbf{f})\,dv = -\operatorname{div}\mathbf{f}\,dv,$$

where the divergence is a vector operator discussed in Chap. VI. Hence

$$\frac{\partial\rho}{\partial t} = -\operatorname{div}\mathbf{f} + P. \tag{1}$$

This is often called the equation of continuity. We may note several special cases. If there is no production of fluid in $dv$, it becomes $\dfrac{\partial\rho}{\partial t} + \operatorname{div}\mathbf{f} = 0$, or using $\mathbf{f} = \rho\mathbf{v}$,

$$\frac{\partial\rho}{\partial t} + \operatorname{div}(\rho\mathbf{v}) = 0. \tag{2}$$

Again, in a steady state, where density is independent of time,

$$\operatorname{div}\mathbf{f} = P. \tag{3}$$

This equation shows the physical meaning of the divergence of a vector: it measures the rate of production of the flowing substance, per unit volume. Finally, if no substance is being produced at the point in question, and density is independent of time, $\operatorname{div}\mathbf{f} = 0$, and we have a divergenceless flow.

116. Gauss's Theorem.

We have proved that the amount of substance flowing out of a small volume $dx\,dy\,dz = dv$ per second equals $\operatorname{div}\mathbf{f}\,dv$ in steady flow. Suppose now that we have a large volume and that we wish to find the total amount flowing out of it per second. This is simply the sum of the amounts flowing from each element. Thus it is a volume integral, $\iiint \operatorname{div}\mathbf{f}\,dv$.
On the other hand, the material all flows through the surface, so that the rate of outflow is $\iint f_n\,dS$. These two expressions must be equal:

$$\iiint \operatorname{div}\mathbf{f}\,dv = \iint f_n\,dS. \tag{4}$$

This is Gauss's theorem, and it holds for any vector $\mathbf{f}$ which is a function of position.

117. Lines of Flow to Measure Rate of Flow.

Let us set up a definite number of lines of flow, so that the number crossing a unit area perpendicular to the flow is numerically equal to the magnitude of the flux density. We could surely do this, but we might have the necessity of sometimes letting lines start or stop, to keep the right number. We can prove, however, that with a divergenceless flow this would not be necessary. The lines start or stop only at places where the divergence is different from zero: that is, they start at sources, stop at sinks. For an elementary proof, let us take a short section of a tube of flow, bounded by two surfaces normal to the flow. Let one of them have an area $A_1$, the other $A_2$, and let the magnitude of the flux density over the one face be $f_1$, over the other $f_2$. Then the total current in over one face is $f_1A_1$, and out over the other is $f_2A_2$. If the flow is divergenceless, these are equal. But the number of lines per unit area on the first is $f_1$, so that the number cutting the one end of the tube is $f_1A_1$, and the number emerging at the other end is $f_2A_2$. Since these are equal, no lines are lost or start within. In other words, in a divergenceless flow, lines never start or stop except at sources or sinks. For a more general proof we note that the number of lines crossing a surface element $dS$, by definition, is $f_n\,dS$. Then the number emerging from a closed surface, and which therefore have started within the surface, is $\iint f_n\,dS$. But by Gauss's theorem this is $\iiint \operatorname{div}\mathbf{f}\,dv$, and is zero if the flow is divergenceless.

118. Irrotational Flow and the Velocity Potential.

In Chap. VI we studied vector fields like our flux vector; we were interested then in forces.
We saw that under certain conditions, a force could be written as a gradient of a potential function. The condition was that the work done in taking a particle around any closed path should be zero, or that the field should be conservative: $\oint \mathbf{F}\cdot d\mathbf{s} = 0$ around any contour. We had another way of stating the condition: it was $\operatorname{curl}\mathbf{F} = 0$ everywhere. In a similar way, if the curl of our velocity vector is zero, we can introduce a potential function here. It is now to be regarded as a purely mathematical device, used simply by analogy with our previous cases, and having nothing to do with potential energy. A flow whose curl is everywhere zero is called an irrotational flow. It is easy to prove that in a whirlpool the curl is different from zero (see for instance Prob. 4, Chap. VI), a nonvanishing curl indicating in fact exactly a whirlpool.

Fig. 31. Lines of flow and equipotentials for flow about a cylinder. Full lines indicate lines of flow, dotted lines equipotentials. In a corresponding electrical problem with charges distributed within the cylinder, and placed in a uniform external electric field, the dotted lines would be lines of force, full lines equipotentials.

Now, physically, we are acquainted with two sorts of fluid flow: streamline flow and turbulent flow. In the latter, eddies or whirlpools form, and the curl of the velocity is not zero. But in the former, there are no eddies, the curl of the velocity is zero, and the flow is irrotational. In a streamline flow, then, we can introduce a potential function, called the velocity potential $\phi$, defined by $\mathbf{v} = -\operatorname{grad}\phi$. The velocity potential, of course, is not a potential energy; its analogy with potential energies is mathematical rather than physical. Nevertheless, we can draw surfaces of constant velocity potential, or equipotentials, and the lines of flow will cut the equipotentials at right angles.
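As an illustration of deriving a velocity from a velocity potential, the sketch below takes a standard two-dimensional potential for uniform flow past a circular cylinder, $\phi = -Ux\,(1 + R^2/r^2)$. This particular formula is an assumption brought in from the general literature, not something derived in this chapter, and the stream speed $U$, radius $R$, and step size are arbitrary illustrative choices. The velocity $\mathbf{v} = -\operatorname{grad}\phi$ is formed by central differences and its behavior on the cylinder and far from it is examined.

```python
import math

U, R = 1.0, 1.0   # assumed stream speed and cylinder radius (arbitrary units)


def phi(x, y):
    """Assumed velocity potential for uniform flow along +x past a cylinder."""
    return -U * x * (1.0 + R * R / (x * x + y * y))


def velocity(x, y, h=1e-6):
    """v = -grad(phi), with the gradient estimated by central differences."""
    vx = -(phi(x + h, y) - phi(x - h, y)) / (2.0 * h)
    vy = -(phi(x, y + h) - phi(x, y - h)) / (2.0 * h)
    return vx, vy


# A point on the surface of the cylinder, at polar angle theta.
theta = 0.7
xs, ys = R * math.cos(theta), R * math.sin(theta)
vx, vy = velocity(xs, ys)

# Component of v along the outward normal of the cylinder: should vanish,
# so that the surface itself is built of lines of flow.
radial = vx * math.cos(theta) + vy * math.sin(theta)

# Far from the cylinder the flow should return to the uniform stream (U, 0).
far = velocity(50.0, 30.0)
print(radial, math.hypot(vx, vy), far)
```

Since the velocity is everywhere along $-\operatorname{grad}\phi$, it is automatically perpendicular to the curves $\phi = \text{constant}$; the vanishing normal component on the cylinder shows that the fluid slides along the surface rather than through it.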
Using the equation of continuity, and assuming that $\rho$ is constant, we have as the general equation for the velocity potential

$$\operatorname{div}(\rho\mathbf{v}) = -\rho\operatorname{div}\operatorname{grad}\phi = -\rho\nabla^2\phi = -\frac{\partial\rho}{\partial t} + P, \tag{4}$$

reducing to Laplace's equation $\nabla^2\phi = 0$ for a steady state where there are no sources or sinks. The introduction of a velocity potential satisfying Laplace's equation makes it possible in many cases to solve hydrodynamic problems by analogy with similar problems in other branches of physics, such as electrostatics. In Chap. XIX we shall find that the electrostatic potential satisfies Laplace's equation, the lines of force being normal to the equipotentials, so that any set of electrostatic equipotentials can be used for a suitable hydrodynamic problem. For instance, in Fig. 31, we show the lines of flow and equipotentials for flow of a liquid about a cylinder. The same lines, however, represent lines of force resulting from a certain distribution of charges in the center of the cylinder, superposed on a uniform electric field.

119. Euler's Equations of Motion for Ideal Fluids.

The equation of continuity serves to determine the velocity of flow of a liquid, but does not determine the pressures, or make any connection with forces. It is essentially a kinematical rather than a dynamical law. It is one of two fundamental equations governing fluid motion. The other is essentially the Newtonian law, force equals mass times acceleration. For a continuous medium, we have already seen how this is to be formulated in the preceding chapter, where we wrote the force on an element of volume in terms of the stresses. As was mentioned in the last chapter, an ideal fluid is characterized by the fact that it supports no shear and hence $\mu = 0$. For this case the six stress components reduce to one, namely $X_x = Y_y = Z_z = -p$ and $X_y = Y_z = Z_x = 0$, if $p$ denotes the pressure in the fluid.
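Furthermore, if there is flow of the fluid one must consider the velocity of each particle as a function of $x$, $y$, $z$, and $t$, and hence

$$\frac{dv_x}{dt} = \frac{\partial v_x}{\partial t} + v_x\frac{\partial v_x}{\partial x} + v_y\frac{\partial v_x}{\partial y} + v_z\frac{\partial v_x}{\partial z},$$

and two similar expressions for $v_y$ and $v_z$. Written in vector form with the help of our symbolic vector $\nabla = \operatorname{grad}$,

$$\frac{dv_x}{dt} = \frac{\partial v_x}{\partial t} + (\mathbf{v}\cdot\nabla)v_x = \frac{\partial v_x}{\partial t} + (\mathbf{v}\cdot\operatorname{grad})v_x,$$

i.e., we form the scalar product of $\mathbf{v}$ and $\nabla$ and then operate on $v_x$. Our general equations of motion become in this case:

$$
\begin{aligned}
X - \frac{1}{\rho}\frac{\partial p}{\partial x} &= \frac{\partial v_x}{\partial t} + (\mathbf{v}\cdot\operatorname{grad})v_x,\\
Y - \frac{1}{\rho}\frac{\partial p}{\partial y} &= \frac{\partial v_y}{\partial t} + (\mathbf{v}\cdot\operatorname{grad})v_y,\\
Z - \frac{1}{\rho}\frac{\partial p}{\partial z} &= \frac{\partial v_z}{\partial t} + (\mathbf{v}\cdot\operatorname{grad})v_z,
\end{aligned}
$$

where $X, Y, Z$ represent the body force (as gravitation) per unit mass, which we neglected in the last chapter. Combined into one vector equation this gives

$$\mathbf{F} - \frac{1}{\rho}\operatorname{grad}p = \frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\operatorname{grad})\mathbf{v}, \tag{5}$$

where $\mathbf{F}$ is the body force. These are the Euler equations of hydrodynamics. In them $\rho$ (the density) is considered a known function of the pressure as given by the equation of state of the substance. We then have $p, v_x, v_y, v_z$ as functions of $x$, $y$, $z$, and $t$. The three equations above and the continuity equation provide the necessary four equations to give a unique solution. For the case of hydrostatic equilibrium, these equations reduce to the form $\mathbf{F} = (1/\rho)\operatorname{grad}p$, from which such familiar things as Archimedes' principle immediately follow.

120. Irrotational Flow and Bernoulli's Equation.

If there is irrotational flow, and the velocity is derived from a velocity potential, Euler's equations take a particularly simple form. If $\mathbf{v} = -\operatorname{grad}\phi$, then we have

$$
\begin{aligned}
(\mathbf{v}\cdot\operatorname{grad})v_x &= -(\mathbf{v}\cdot\operatorname{grad})\frac{\partial\phi}{\partial x}
= \frac{\partial\phi}{\partial x}\frac{\partial^2\phi}{\partial x^2} + \frac{\partial\phi}{\partial y}\frac{\partial^2\phi}{\partial x\,\partial y} + \frac{\partial\phi}{\partial z}\frac{\partial^2\phi}{\partial x\,\partial z}\\
&= \frac{1}{2}\frac{\partial}{\partial x}\left[\left(\frac{\partial\phi}{\partial x}\right)^2 + \left(\frac{\partial\phi}{\partial y}\right)^2 + \left(\frac{\partial\phi}{\partial z}\right)^2\right],
\end{aligned}
$$

so that

$$(\mathbf{v}\cdot\operatorname{grad})\mathbf{v} = \operatorname{grad}\left(\frac{v^2}{2}\right)$$

in the special case where $\operatorname{curl}\mathbf{v} = 0$. Further, we introduce a quantity $\Pi = \displaystyle\int \frac{dp}{\rho}$ whose gradient is $\operatorname{grad}\Pi = \dfrac{d\Pi}{dp}\operatorname{grad}p = \dfrac{1}{\rho}\operatorname{grad}p$.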
Euler's equation for the steady state, where $\mathbf{v}$ is independent of time, then becomes

$$\mathbf{F} = \operatorname{grad}\left(\Pi + \frac{v^2}{2}\right).$$

As a result of this equation, we see that for irrotational flow to occur, $\mathbf{F}$ must be the gradient of a certain quantity, or $\mathbf{F}$ must be a conservative force, derivable from a potential. We may then set $\mathbf{F} = -\operatorname{grad}V$, and Euler's equation becomes

$$\operatorname{grad}\left(V + \Pi + \frac{v^2}{2}\right) = 0,$$

or, integrated,

$$V + \Pi + \frac{v^2}{2} = \text{constant}.$$

This is Bernoulli's equation. For the special case of an incompressible fluid, $\rho$ is independent of $p$, so that $\Pi$ is equal to $p/\rho$. In that case the equation may be written

$$\rho V + p + \tfrac{1}{2}\rho v^2 = \text{constant}.$$

Bernoulli's equation is essentially an energy integral, the term $\rho V$ representing the potential energy per unit volume, $p$ the contribution to the energy resulting from the pressure, and $\tfrac{1}{2}\rho v^2$ the kinetic energy per unit volume. As we have stated, Bernoulli's equation, supplemented for a compressible fluid by the relation giving density as function of pressure, determines the pressure at each point of space, when the velocity and external potential are known. For instance, if there is no external force field ($V = 0$), we see that the pressure decreases at points where the velocity is high, which means at points where the tubes of flow narrow down.

121. Viscous Fluids.

In Sec. 119 we mentioned the fact that ideal fluids support no shearing stresses. This, however, is not true of viscous fluids. Imagine a viscous liquid flowing horizontally, the lower layers dragging along the bottom, and the velocity increasing with height, so that $v_x = v_x(y)$ and the other components of $\mathbf{v}$ are zero, if the $xz$ plane is horizontal and $y$ is vertical. Then if we imagine a horizontal element of area in the liquid at a certain height, the material above the element of area will pull tangentially on the material below it on account of viscosity, thus exerting a shearing stress.
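Experimentally, this stress, which is $X_y$, is proportional to the rate of increase of horizontal component of velocity with height: if $k$ is the coefficient of viscosity, $X_y = k\,\dfrac{\partial v_x}{\partial y}$. This is a special case of the general laws governing stresses in a viscous medium, connecting the stresses with the rates of change of the velocity components with position. In the last chapter we have given the general form of Hooke's law, the law giving stresses in an elastic medium in terms of the strains. By analogy we can set up the relations for a viscous fluid, but now the stresses are proportional, not to the strain components themselves, but to their time derivatives. By comparison with Eq. (5), Chap. XVI, we see that $k$ takes the place of the shear modulus, and that the component of strain $\partial\xi/\partial y + \partial\eta/\partial x$ must be replaced by its time derivative, $\partial v_x/\partial y + \partial v_y/\partial x = \partial v_x/\partial y$ in our special case, since $v_y = 0$. This tells us how in general we are to change Hooke's law for the case of viscous incompressible fluids. We place $\partial v_x/\partial x + \partial v_y/\partial y + \partial v_z/\partial z = \operatorname{div}\mathbf{v} = 0$, corresponding to $\partial\xi/\partial x + \partial\eta/\partial y + \partial\zeta/\partial z = 0$ in the strains, replace $\mu$ by $k$ and insert the time derivatives of the strain components. Thus we have the following relations between the stress and strain components for liquids:

$$
\begin{aligned}
X_x &= -p + 2k\frac{\partial v_x}{\partial x}, & X_y &= k\left(\frac{\partial v_x}{\partial y} + \frac{\partial v_y}{\partial x}\right),\\
Y_y &= -p + 2k\frac{\partial v_y}{\partial y}, & Y_z &= k\left(\frac{\partial v_y}{\partial z} + \frac{\partial v_z}{\partial y}\right),\\
Z_z &= -p + 2k\frac{\partial v_z}{\partial z}, & Z_x &= k\left(\frac{\partial v_z}{\partial x} + \frac{\partial v_x}{\partial z}\right),
\end{aligned}
\tag{6}
$$

where we have included the ordinary pressure of the liquid in addition to the viscous stresses. Inserting the values of the stress components in the equations of motion (2) of the previous chapter and remembering that for an incompressible fluid we have the continuity equation $\operatorname{div}\mathbf{v} = \partial v_x/\partial x + \partial v_y/\partial y + \partial v_z/\partial z = 0$, there follow the general equations of motion for viscous liquids:

$$
\begin{aligned}
\rho X - \frac{\partial p}{\partial x} + k\nabla^2 v_x &= \rho\frac{dv_x}{dt},\\
\rho Y - \frac{\partial p}{\partial y} + k\nabla^2 v_y &= \rho\frac{dv_y}{dt},\\
\rho Z - \frac{\partial p}{\partial z} + k\nabla^2 v_z &= \rho\frac{dv_z}{dt},
\end{aligned}
$$

or in vector form:

$$\rho\mathbf{F} - \operatorname{grad}p + k\nabla^2\mathbf{v} = \rho\frac{d\mathbf{v}}{dt},$$

differing from Eq. (5) by the term $k\nabla^2\mathbf{v}$.

122. Poiseuille's Law.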
Suppose we have an incompressible liquid flowing in a steady state in a horizontal cylinder of radius $R$ parallel to the long axis of the cylinder ($x$ axis). We have $v_y = v_z = 0$ and since there are no body forces $X = Y = Z = 0$. The equation of continuity becomes $\partial v_x/\partial x = 0$ so that $v_x$ is a function of $y$ and $z$ alone. Then $dv_x/dt = v_x\,\partial v_x/\partial x + v_y\,\partial v_x/\partial y + v_z\,\partial v_x/\partial z = 0$. Furthermore, if we take the divergence of the fundamental equations of motion, we have:

$$\rho\operatorname{div}\mathbf{F} - \operatorname{div}\operatorname{grad}p + k\nabla^2(\operatorname{div}\mathbf{v}) = \rho\frac{d}{dt}(\operatorname{div}\mathbf{v}).$$

Now by the equation of continuity $\operatorname{div}\mathbf{v} = 0$, and in our case of no external forces this reduces to $\operatorname{div}\operatorname{grad}p = \nabla^2 p = 0$. In our problem $\partial p/\partial y = \partial p/\partial z = 0$, so that $\partial^2 p/\partial x^2 = 0$. The pressure is thus a linear function of $x$, so that we have a constant pressure gradient in the tube. Of the three equations, only the first is left:

$$\frac{\partial p}{\partial x} = k\left(\frac{\partial^2 v_x}{\partial y^2} + \frac{\partial^2 v_x}{\partial z^2}\right),$$

and since $\partial p/\partial x$ is constant $= a$, and we have cylindrical symmetry, this reduces to

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dv_x}{dr}\right) = \frac{a}{k},$$

where $r$ is the distance from the axis of the cylinder. Integrated, this yields $v_x = \dfrac{a}{4k}r^2 + b\ln r + c$, and since $v_x$ is finite for $r = 0$, $b = 0$. If the liquid clings to the walls of the cylinder, $v_x = 0$ when $r = R$, so that we find

$$v_x = \frac{a}{4k}\left(r^2 - R^2\right). \tag{8}$$

Thus the liquid flows in cylindrical tubes of constant velocity. This type of motion is called "laminar" motion. The velocity varies parabolically across a diameter of the cylinder. The amount of liquid flowing per second through a cylindrical ring of thickness $dr$, radius $r$, is $dQ = 2\pi r v_x\,dr$ so that the total discharge rate of such a cylinder is

$$Q = 2\pi\int_0^R r v_x\,dr = -\frac{\pi a R^4}{8k} = \frac{\pi R^4}{8kL}(p_1 - p_2), \tag{9}$$

where we have placed the constant pressure gradient $a = -\dfrac{p_1 - p_2}{L}$. This law, known as Poiseuille's law, furnishes a very nice experimental method of determining the coefficient of viscosity of liquids.

Problems

1. Liquid is confined between two parallel plates, so that it flows in two dimensions.
At a certain point, a pipe discharges liquid at a constant rate into the region. Find the velocity potential, and velocity, as a function of position. Show by direct calculation that the flow outward over any circle about the source is the same.

2. A shallow tray containing fluid has a source at one point, an equal sink at another, so that liquid flows in two dimensions from source to sink. Find the equation of the equipotentials and the lines of flow, prove they are circles and plot them. (Suggestion: since the equations are linear, the potential or flux due to two sources is the sum of the solutions for the separate sources.)

3. Prove that $\dfrac{\partial}{\partial x}\left(\dfrac{1}{r}\right)$ is a solution of Laplace's equation. Investigate the lines of flow connected with this as a potential. Draw the lines, in the $xy$ plane. What sort of physical situation would be described by this case?

4. Consider an ideal fluid at rest. It is subjected to an impulsive pressure $\displaystyle\int_0^{\tau} p\,dt$, where $\tau$ indicates the interval of time during which the pressure is applied. If no body forces act on the fluid, prove by integrating Euler's equations that the impulsive pressure divided by the density of the fluid equals the velocity potential of the ensuing motion. This is the physical significance of a velocity potential.

5. Show for a liquid in equilibrium under the action of gravity that the pressure varies linearly with the depth below the surface. Calculate the total force exerted on the surface of a submerged body by the liquid and show that the resultant force is directed upwards and is given in magnitude by Archimedes' principle. [Hint: If a vector has only one component different from zero, e.g., $A_x$, then Gauss's theorem becomes $\displaystyle\iiint \frac{\partial A_x}{\partial x}\,dV = \iint A_x\cos(n, x)\,dS$.]

6. The free surface of a liquid is one of constant pressure.
If an incompressible fluid is placed in a cylindrical vessel and the whole rotated with constant angular velocity $\omega$, show that the free surface becomes a paraboloid of revolution. (Hint: Introduce a fictitious potential energy to take care of centrifugal force and use the hydrostatic equations.)

7. A gas maintained at constant pressure $p$ flows steadily out of a small hole into the atmosphere, pressure $p_0$. Assume the density constant. Find the expressions for the velocity of efflux and for the force exerted on the gas container due to the efflux. If the gas is oxygen at a pressure of 4 atmospheres in the tank, calculate the efflux velocity (1) with the density constant, and (2) taking into account the variation of density with pressure, assuming an adiabatic expansion.

8. With the help of Gauss's theorem prove the theorem of the last chapter that the stress tensor is symmetric.

9. Calculate the rate of discharge of a cylindrical pipe standing vertically, the liquid flowing in laminar flow under the action of gravity only.

10. A perfect gas at constant temperature is in equilibrium under the action of gravity. Find the relation between the pressure of the gas and the height above the surface of the earth.

11. Carry through the derivation of the laws of motion of viscous fluids using the modified form of Hooke's law and the general equations of motion of an elastic medium.

CHAPTER XVIII

HEAT FLOW

The problem of heat flow, although of quite different physical nature from elasticity and hydrodynamics, involves similar mathematics. Indeed, Fourier was concerned with problems of heat flow when he developed the series known by his name which we have used so much in our study of vibrations. First we set up the differential equation governing heat flow in a manner similar to the reasoning of the preceding chapters.

123. Differential Equation of Heat Flow.
The fundamental physical fact is that when there is a difference of temperature in a material body, heat will flow, and the rate of flow is proportional to the temperature gradient. Suppose we have a slab of thickness $L$, area $a$, with a difference of temperature $T_2 - T_1$ between the faces. Then the amount of heat flowing per second across the face is $-\dfrac{ka(T_2 - T_1)}{L}$, where $k$ is the thermal conductivity, the negative sign meaning that if $T_2 > T_1$, the flow will be backward toward low temperature. In the limit of an infinitely thin slab, this is simply $-ka\,\dfrac{\partial T}{\partial x}$, if $x$ is the coordinate measured in the direction of the heat flow. Next, there is the fact that if heat flows into a region, its temperature rises, the amount of rise being given by the relation that the amount of heat flowing in equals the change of temperature times the heat capacity, which in turn is the specific heat $c$ times the mass. Putting these together, we obtain an equation which states the following: the rate of heat flow into a body is proportional to the time rate of change of its temperature; or, looking at it in another way, it is proportional to the temperature gradient around its boundaries. By eliminating the heat flow, we obtain a differential equation for the temperature.

Our first principle, which we have stated in the form that $-ka\,\dfrac{\partial T}{\partial x}$ measures the heat flow across the area $a$ perpendicular to the $x$ axis, is evidently a special case of the general law that the flux density of heat flow is $\mathbf{f} = -k\operatorname{grad}T$. This incidentally shows us at once that, if $k$ is a constant, $\mathbf{f}$ is derivable from a potential, in this case $kT$, so that the curl of the flux is zero. The surfaces of constant temperature are called isothermals, and they serve as equipotentials, the lines of flow being at right angles to the isothermals.
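The slab formula can be put into a short calculation. This is a minimal sketch in which the function name `slab_heat_rate` and all numerical values are hypothetical, given in arbitrary but consistent units rather than as data for any real material. It evaluates $-ka(T_2 - T_1)/L$ and shows how the sign follows the direction of the temperature difference.

```python
def slab_heat_rate(k, area, L, T1, T2):
    """Heat per second crossing a slab in the direction from face 1 to face 2.

    Implements -k * area * (T2 - T1) / L; a positive result means heat flows
    from face 1 toward face 2, i.e. face 1 is the hotter one.
    """
    return -k * area * (T2 - T1) / L


# Illustrative numbers only (arbitrary consistent units).
q_forward = slab_heat_rate(k=0.8, area=2.0, L=0.004, T1=20.0, T2=0.0)
q_backward = slab_heat_rate(k=0.8, area=2.0, L=0.004, T1=0.0, T2=20.0)
print(q_forward, q_backward)
```

Interchanging the two face temperatures reverses the sign of the result but not its magnitude, as the negative sign in the formula requires.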
The equation of continuity now states that the time rate of increase of heat per unit volume equals the rate at which the heat flows in over the surface, plus the rate at which heat is produced inside. To raise the temperature of unit volume one degree requires an amount of heat equal to the heat capacity, or $c\rho$, if $c$ is the specific heat, $\rho$ the density of matter. Thus the time rate of increase of heat is $c\rho$ times the time rate of increase of temperature. We have then

$$\iiint c\rho\,\frac{\partial T}{\partial t}\,dv = -\iint f_n\,dS + \iiint P\,dv,$$

where $P$ means the rate of production of heat per unit volume. By Gauss's theorem, the second term becomes $-\iiint \operatorname{div}\mathbf{f}\,dv$, so that for a small volume we have

$$c\rho\,\frac{\partial T}{\partial t} = -\operatorname{div}\mathbf{f} + P.$$

Substituting $\mathbf{f} = -k\,\operatorname{grad} T$,

$$c\rho\,\frac{\partial T}{\partial t} = k\,\operatorname{div}\operatorname{grad} T + P = k\nabla^2 T + P. \tag{1}$$

This is the equation of heat flow. At a point where heat is not being produced, it reduces to an equation similar to the wave equation as far as the dependence on space is concerned. It contains, however, a first rather than a second time derivative, and this results in solutions which are exponentially damped, like a particle with resistance but no restoring force, rather than oscillating solutions. The particular case where the temperature is independent of the time, the steady state, leads simply to Laplace's equation, the term in time vanishing.

124. The Steady Flow of Heat. The isothermals and lines of flow for the steady flow of heat are determined from Laplace's equation, and in some elementary cases we can find them with great ease. First let us consider a one-dimensional flow, which we obtain with a slab of a substance, like a window pane, assuming that the temperature varies only with the coordinate $x$ normal to the surface, being independent of $y$ and $z$. Laplace's equation becomes $d^2T/dx^2 = 0$, so that $T = a + bx$, with a constant temperature gradient. Thus if a face at $x = 0$ is kept at temperature $T_0$, the other face at $x = L$ at $T_1$, the temperature at intermediate points is given by $T = T_0 + (x/L)(T_1 - T_0)$.
It is this simple case which furnishes the basis for the usual definition of thermal conductivity.

The cylinder forms a slightly more difficult problem in steady flow. For instance, let us ask for the steady state of temperature within a pipe formed of two concentric cylinders, whose inside and outside faces are kept at fixed temperatures. The temperature will depend only on $r$, and will be determined, on account of the divergenceless nature of the flow, by the condition that the same amount of heat flows across the surface of any cylinder with radius intermediate between $r_0$ and $r_1$, the minimum and maximum radii of the pipe. This amount of heat is the product of the normal component of the flow, which is $f_r = -k\,(dT/dr)$, by the area of the cylinder, which for unit length along the pipe is $2\pi r$. In other words,

$$2\pi r f_r = -2\pi k r\,\frac{dT}{dr} = \text{constant}, \qquad \frac{dT}{dr} = \frac{a}{r}, \qquad T = a \ln r + b.$$

The two constants can be determined by fitting the temperatures at the two surfaces of the pipe. This example is interesting in showing that the temperature gradient is not always a constant in the steady state. The reason is very simple: the tubes of flow are not of constant cross-sectional area, and thus with a divergenceless flow the number of lines of flow per square centimeter, and consequently the magnitude of the temperature gradient and flux vector, must change from point to point. The same thing is evident in the flow of heat in a sphere, where the flow through concentric spheres must be the same. Hence, since the areas of these spheres increase proportionally to the squares of the radii, the temperature gradient must be inversely proportional to the square of the distance from the center, and the temperature inversely as the first power. These relations are just like those of the field and potential of a point charge in electrostatics, and as we shall later see, for just the same reason: both are solutions of Laplace's equation.

125.
Flow Vectors in Generalized Coordinates. Complicated problems in the steady flow of heat, as in hydrodynamics and electrostatics, are best approached by introducing curvilinear coordinates, so that the boundaries of the bodies are expressed by coordinate surfaces, as with the cylinder and sphere. This suggests the formulation of the equation of steady flow, or Laplace's equation, in such general coordinates. Let the coordinates be $q_1$, $q_2$, $q_3$ and let them be orthogonal coordinates, so that the three sets of coordinate surfaces, $q_1 = \text{constant}$, $q_2 = \text{constant}$, $q_3 = \text{constant}$, intersect at right angles. Now let us move a distance $ds_1$ normal to a surface $q_1 = \text{constant}$. By doing so, $q_2$ and $q_3$ do not change, but we reach another surface on which $q_1$ has increased by $dq_1$, which in general is different from $ds_1$. Thus, with polar coordinates, if the displacement is along the radius, so that $r$ is changing, $ds = dr$; but if it is along a tangent to a circle, so that $\theta$ is changing, $ds = r\,d\theta$. In general, we have

$$dq_1 = h_1\,ds_1, \qquad dq_2 = h_2\,ds_2, \qquad dq_3 = h_3\,ds_3, \tag{3}$$

where in polar coordinates the $h$ connected with $r$ is unity, but that connected with $\theta$ is $1/r$. The first step in setting up vector operations in any set of coordinates is to derive these $h$'s, which can be done by elementary geometrical methods.

126. Gradient in Generalized Coordinates. The component of the gradient of a scalar $S$ in any direction is its directional derivative in that direction. Thus the component in the direction 1 (normal to the surface $q_1 = \text{constant}$) is

$$\frac{\partial S}{\partial s_1} = h_1\,\frac{\partial S}{\partial q_1}.$$

For instance, in polar coordinates, the $r$ component is $\partial S/\partial r$, and the $\theta$ component is $(1/r)\,\partial S/\partial\theta$.

Fig. 32. Element of volume for vector operations in curvilinear coordinates.

127. Divergence in Generalized Coordinates. Let us apply Gauss's theorem to a small volume element $dV = ds_1\,ds_2\,ds_3$, bounded by coordinate surfaces at $q_1$, $q_1 + dq_1$, etc., as in Fig. 32.
If we have a vector $\mathbf{A}$, of components $A_1$, $A_2$, $A_3$ along the three curvilinear axes, the flux into the volume over the face at $q_1$, whose area is $ds_2\,ds_3$, is $(A_1\,ds_2\,ds_3)_{q_1}$, and the corresponding flux out over the opposite face is $(A_1\,ds_2\,ds_3)_{q_1 + dq_1}$, where we note that the area $ds_2\,ds_3$ changes with $q_1$ as well as the flux density $A_1$. Thus the flux out over these two faces is

$$\frac{\partial}{\partial q_1}(A_1\,ds_2\,ds_3)\,dq_1 = \frac{\partial}{\partial q_1}\!\left(\frac{A_1}{h_2 h_3}\right) dq_1\,dq_2\,dq_3 = h_1 h_2 h_3\,\frac{\partial}{\partial q_1}\!\left(\frac{A_1}{h_2 h_3}\right) dV.$$

Proceeding similarly with the other pairs of faces, and setting the whole outward flux equal to $\operatorname{div}\mathbf{A}\,dV$, we have

$$\operatorname{div}\mathbf{A} = h_1 h_2 h_3\left[\frac{\partial}{\partial q_1}\!\left(\frac{A_1}{h_2 h_3}\right) + \frac{\partial}{\partial q_2}\!\left(\frac{A_2}{h_3 h_1}\right) + \frac{\partial}{\partial q_3}\!\left(\frac{A_3}{h_1 h_2}\right)\right]. \tag{4}$$

128. Laplacian. Writing the Laplacian as $\operatorname{div}\operatorname{grad}\phi$, and placing $A_1 = \operatorname{grad}_1\phi = h_1\,\partial\phi/\partial q_1$, etc., in the expression for $\operatorname{div}\mathbf{A}$, we have

$$\nabla^2\phi = \operatorname{div}\operatorname{grad}\phi = h_1 h_2 h_3\left[\frac{\partial}{\partial q_1}\!\left(\frac{h_1}{h_2 h_3}\frac{\partial\phi}{\partial q_1}\right) + \frac{\partial}{\partial q_2}\!\left(\frac{h_2}{h_3 h_1}\frac{\partial\phi}{\partial q_2}\right) + \frac{\partial}{\partial q_3}\!\left(\frac{h_3}{h_1 h_2}\frac{\partial\phi}{\partial q_3}\right)\right]. \tag{5}$$

It can easily be verified that this formula leads to the same values for the Laplacian in special cases which we have already obtained by direct differentiation in Chap. XV. But now we can understand the formula better, for we see that the terms like $h_1/h_2 h_3$ appearing inside the first differentiation arise from the fact that the flux through the opposite sides of a volume may differ not only on account of variation of the flux density, but also because the opposite sides can have different areas, as they do in the small volume element determined by coordinate surfaces with curvilinear coordinates.

129. Steady Flow of Heat in a Sphere. Having obtained Laplace's equation in arbitrary coordinate systems, the problem of solving for the steady flow of heat becomes that of solving Laplace's equation in a suitable system, subject to certain boundary conditions. For instance, suppose we know that the surface of a sphere, radius $r_0$, is kept at a temperature independent of time, though depending on the angles $\theta$ and $\phi$. We then can set up the steady distribution of temperature within the sphere by solving Laplace's equation in spherical coordinates.
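As an illustration of how Eq. (5) is applied here, the following sketch writes out Laplace's equation for the temperature in spherical coordinates. It assumes the identification $q_1 = r$, $q_2 = \theta$, $q_3 = \phi$ and the scale factors $h_1 = 1$, $h_2 = 1/r$, $h_3 = 1/(r\sin\theta)$, which follow from the displacements $dr$, $r\,d\theta$, $r\sin\theta\,d\phi$ together with the convention $dq_i = h_i\,ds_i$ of Eq. (3); these values are worked out for this example and are not quoted from the text.

```latex
% Assumed scale factors for (q_1, q_2, q_3) = (r, \theta, \phi):
%   h_1 = 1,  h_2 = 1/r,  h_3 = 1/(r \sin\theta)
\begin{align*}
h_1 h_2 h_3 &= \frac{1}{r^2 \sin\theta}, &
\frac{h_1}{h_2 h_3} &= r^2 \sin\theta, &
\frac{h_2}{h_3 h_1} &= \sin\theta, &
\frac{h_3}{h_1 h_2} &= \frac{1}{\sin\theta},
\end{align*}
% Inserting these into Eq. (5) with \phi replaced by the temperature T:
\begin{align*}
\nabla^2 T
 &= \frac{1}{r^2 \sin\theta}\left[
      \frac{\partial}{\partial r}\!\left(r^2 \sin\theta\,\frac{\partial T}{\partial r}\right)
    + \frac{\partial}{\partial \theta}\!\left(\sin\theta\,\frac{\partial T}{\partial \theta}\right)
    + \frac{\partial}{\partial \phi}\!\left(\frac{1}{\sin\theta}\frac{\partial T}{\partial \phi}\right)
    \right] \\
 &= \frac{1}{r^2}\frac{\partial}{\partial r}\!\left(r^2 \frac{\partial T}{\partial r}\right)
  + \frac{1}{r^2 \sin\theta}\frac{\partial}{\partial \theta}\!\left(\sin\theta\,\frac{\partial T}{\partial \theta}\right)
  + \frac{1}{r^2 \sin^2\theta}\frac{\partial^2 T}{\partial \phi^2} = 0 .
\end{align*}
```

The first term is the one that survives for a temperature depending on $r$ alone, and it reproduces the earlier remark that the radial gradient in a sphere falls off as the inverse square of the distance from the center.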
The problem is mathematically like that of Problems 6, 7, and 8, Chap. XV, the vibration of a sphere, if we seek a solution independent of time. Just as in those problems, we separate variables in Laplace's equation, obtaining solutions of the form $\sin m\phi\,P_l^m(\cos\theta)\,R$, where the $P$'s are called associated Legendre polynomials, and where $R$ satisfies the equation

$$\frac{1}{r^2}\frac{d}{dr}\!\left(r^2\frac{dR}{dr}\right) - \frac{l(l+1)}{r^2}\,R = 0,$$

which can be immediately solved by setting $R = r^n$, where $n$ is an integer to be determined. Substituting, this leads at once to the equation $n(n+1) = l(l+1)$, which has two solutions, $n = l$ or $n = -(l+1)$. In the present case, where the function must stay finite within the sphere, at $r = 0$, we cannot have inverse powers, so that the only allowable functions are $r^l$. Other problems solved by the same method, however, as for instance those of the electrostatic fields of distributions of charges, often involve functions which may become infinite at $r = 0$ but remain finite at large $r$'s, and they must be expanded in the series of inverse powers.

We now have for a general solution

$$\sum_l \sum_m \left(A_{ml}\sin m\phi + B_{ml}\cos m\phi\right) P_l^m(\cos\theta)\,r^l.$$

To get the coefficients of the various terms in the sum, we set $r = r_0$, and determine the coefficients so that the resulting function of $\theta$ and $\phi$ is the assumed temperature distribution. This amounts to an expansion of the assumed function in series in the orthogonal functions $(\sin m\phi \text{ or } \cos m\phi)\,P_l^m(\cos\theta)$, and can be done by the usual methods for such expansions.

130. Spherical Harmonics. To understand the physical meaning of the various terms of the expansion, we should consider the spherical harmonics, or functions of angles.
Solving for these as in the problems quoted above, we find for the first few functions the following values:

$l = 0$, $m = 0$: constant
$l = 1$, $m = \pm 1$: $(\sin\phi \text{ or } \cos\phi)\sin\theta$
$l = 1$, $m = 0$: $\cos\theta$
$l = 2$, $m = \pm 2$: $(\sin 2\phi \text{ or } \cos 2\phi)\sin^2\theta$
$l = 2$, $m = \pm 1$: $(\sin\phi \text{ or } \cos\phi)\sin\theta\cos\theta$
$l = 2$, $m = 0$: $3\cos^2\theta - 1$

These functions are shown graphically in Fig. 33, where the intersections of the nodal planes or cones with unit sphere are drawn. Thus the functions with $l = 1$ have one nodal plane, which may be perpendicular to any one of the three coordinate axes. This is seen most easily by remembering that $x = r\sin\theta\cos\phi$, $y = r\sin\theta\sin\phi$, $z = r\cos\theta$, so that the three solutions of the problem corresponding to $l = 1$ ($r$ times the functions of angle) are simply $x$, $y$, $z$. These are obviously solutions of Laplace's equation, and have the nodal planes $x = 0$, $y = 0$, $z = 0$, respectively. Similarly by making linear combinations of these three functions, we obtain solutions having any desired nodal plane. This is analogous to the degeneracy in the circular membrane, discussed in Sec. 103. With $l = 2$, there are two nodal surfaces, and so on. For discussing the vibrations of a sphere, of course these nodes would represent the regions of no displacement, the material on one side being displaced one way, the material on the other side in the opposite direction.

Fig. 33. Spherical harmonics. Figures represent nodal lines on the surface of a sphere, for the functions $\sin m\phi\,P_l^m(\cos\theta)$ and $\cos m\phi\,P_l^m(\cos\theta)$. Upper line, $l = 1$; lower line, $l = 2$.

With heat flow, the separate terms represent simple types of steady temperature distribution. For instance, the terms with $l = 1$ represent spheres in which the surface temperature varies as the cosine of the colatitude angle, or as the distance in a direction along the axis, and our solution tells us that in this case the temperature within the body varies linearly with distance, as in a flat slab.
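The claim that $r^l$ times each of these angular functions is a steady temperature distribution can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names (`laplacian`, `solid_harmonic`, `max_abs_laplacian`) and the sample points are hypothetical, and the Laplacian is estimated by central differences rather than by the analytic formula.

```python
import math

def laplacian(f, x, y, z, h=1e-3):
    """Central-difference estimate of the Laplacian of f at (x, y, z)."""
    return (
        f(x + h, y, z) + f(x - h, y, z)
        + f(x, y + h, z) + f(x, y - h, z)
        + f(x, y, z + h) + f(x, y, z - h)
        - 6.0 * f(x, y, z)
    ) / (h * h)

# (l, angular factor) pairs taken from the list of the first few functions.
HARMONICS = {
    "l=1, m=0": (1, lambda th, ph: math.cos(th)),
    "l=1, m=1, cos": (1, lambda th, ph: math.sin(th) * math.cos(ph)),
    "l=1, m=1, sin": (1, lambda th, ph: math.sin(th) * math.sin(ph)),
    "l=2, m=0": (2, lambda th, ph: 3.0 * math.cos(th) ** 2 - 1.0),
    "l=2, m=1, cos": (2, lambda th, ph: math.sin(th) * math.cos(th) * math.cos(ph)),
    "l=2, m=2, sin": (2, lambda th, ph: math.sin(th) ** 2 * math.sin(2.0 * ph)),
}

def solid_harmonic(name):
    """Return T(x, y, z) = r**l times the named angular function."""
    l, angular = HARMONICS[name]

    def T(x, y, z):
        r = math.sqrt(x * x + y * y + z * z)
        theta = math.acos(z / r)   # colatitude
        phi = math.atan2(y, x)     # azimuth
        return r ** l * angular(theta, phi)

    return T

def max_abs_laplacian(name, points):
    """Largest |Laplacian| of the named solution over the sample points."""
    T = solid_harmonic(name)
    return max(abs(laplacian(T, *p)) for p in points)

if __name__ == "__main__":
    sample_points = [(0.3, -0.4, 0.5), (-0.7, 0.2, 0.1)]  # arbitrary, off the z axis
    for name in HARMONICS:
        print(name, max_abs_laplacian(name, sample_points))
```

Each printed value should be zero up to rounding error, whereas a function such as $r^2$ with a constant angular factor gives a Laplacian of 6 and so is not an admissible steady temperature.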
Higher terms represent more complicated solutions, and by superposing them any desired steady heat flow can be built up.

131. Fourier's Method for the Transient Flow of Heat. The simplest type of problem in the transient flow of heat is the following: At $t = 0$, a body has a temperature which is an arbitrary function of position. At that instant, it is plunged into a cooling bath of some sort, which instantly cools its surfaces to a fixed distribution of surface temperature which is maintained after that. The problem is to find the temperature throughout the body as a function of time as it cools from its initial to its final steady state.

This can be easily reduced to a simpler case. We write the temperature at any time as the sum of two terms, the transient solution, and the steady-state solution. The latter is the temperature distribution set up by the cooling baths around the surface, and is discussed as in the last few sections in which steady flow of heat has been considered. The transient solution starts off with a temperature distribution which, added to the steady-state solution, gives the assumed initial temperature distribution of the body, and then gradually damps down to zero, finally leaving the steady-state solution only. Since at any instant after $t = 0$ the steady-state solution by itself gives the correct boundary temperature about the surface of the body, we see that the transient must give zero temperature at all points of the surface, independent of time. Thus the transient by itself is the solution of the problem in which a body is heated to an arbitrary temperature distribution at $t = 0$, after that is plunged into a cooling bath maintaining its whole surface at temperature zero, and gradually cools down to this temperature. We investigate this transient problem.
First we take the one-dimensional case, again of a slab, in which the initial temperature is an arbitrary function of $x$, but at all times after $t = 0$ the two faces, at $x = 0$ and $x = L$, are maintained at $T = 0$. The heat-flow equation becomes

$$\frac{\partial^2 T}{\partial x^2} = \frac{c\rho}{k}\frac{\partial T}{\partial t} = A\,\frac{\partial T}{\partial t}, \qquad A = \frac{c\rho}{k}.$$

We solve this equation by separation of variables. If $T = X(x)\,\Theta(t)$, and if we substitute in the equation and divide by $T$, we have

$$\frac{1}{X}\frac{d^2X}{dx^2} = \frac{A}{\Theta}\frac{d\Theta}{dt} = -C^2.$$

Then separating we have

$$\frac{d\Theta}{dt} + \frac{C^2}{A}\,\Theta = 0, \qquad \frac{d^2X}{dx^2} + C^2 X = 0.$$

The solutions are

$$\Theta = e^{-C^2 t/A}, \qquad X = \sin Cx \ \text{or}\ \cos Cx.$$

We see that the temperature decreases exponentially with the time, approaching a constant value, a very reasonable behavior.

The boundary condition is now $T = 0$ when $x = 0$, $x = L$, and we satisfy this as we would with the vibrating string: we take only sines, and only those which reduce to zero at $x = L$; that is, we take $\sin(n\pi x/L)$, where $n$ is an integer. In other words, $C = n\pi/L$, so that the function is $\text{constant} \times e^{-(n^2\pi^2/AL^2)t}\sin(n\pi x/L)$, and the whole solution, writing in the value of $A$, is

$$T = \sum_n K_n\,e^{-(n^2\pi^2 k/c\rho L^2)\,t}\,\sin\frac{n\pi x}{L}. \tag{6}$$

Let us assume that the temperature distribution at $t = 0$ is $T = f(x)$. Then we wish to find the coefficients $K_n$, determining the temperature at later times. At $t = 0$ the exponentials go to 1, so that we have

$$f(x) = \sum_n K_n \sin\frac{n\pi x}{L}.$$

We can then find the coefficients $K_n$ by Fourier's method, so that the problem is solved.

The qualitative nature of the solution is easy to see. The original shape of the temperature curve will be distorted as time goes on, since the terms with high $n$ damp down more rapidly than the others. After a certain lapse of time the whole slab will have become cooler, but also with a more simple temperature distribution, approximating the single term with $n = 1$.
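The way the high-$n$ terms die out first can be seen in a short numerical sketch. It assumes a slab initially at a uniform temperature of unity, finds the coefficients $K_n$ of the sine series by a simple midpoint-rule integration of $K_n = (2/L)\int_0^L f(x)\sin(n\pi x/L)\,dx$, and sums the damped series. The function names, the parameter `diffusivity` (standing for $k/c\rho$), and the numerical values are assumptions chosen for illustration.

```python
import math

def sine_coefficients(f, L, n_terms, n_quad=2000):
    """K_n = (2/L) * integral_0^L f(x) sin(n pi x / L) dx, by the midpoint rule."""
    dx = L / n_quad
    xs = [(j + 0.5) * dx for j in range(n_quad)]
    return [
        (2.0 / L) * sum(f(x) * math.sin(n * math.pi * x / L) for x in xs) * dx
        for n in range(1, n_terms + 1)
    ]

def slab_temperature(K, L, diffusivity, x, t):
    """Sum K_n exp(-(n pi / L)^2 * diffusivity * t) sin(n pi x / L) over n."""
    total = 0.0
    for n, K_n in enumerate(K, start=1):
        rate = (n * math.pi / L) ** 2 * diffusivity  # decay rate grows as n squared
        total += K_n * math.exp(-rate * t) * math.sin(n * math.pi * x / L)
    return total

if __name__ == "__main__":
    L, diffusivity = 1.0, 1.0                      # illustrative values only
    K = sine_coefficients(lambda x: 1.0, L, n_terms=50)
    for t in (0.001, 0.01, 0.1, 0.2):
        full = slab_temperature(K, L, diffusivity, 0.5 * L, t)
        first = K[0] * math.exp(-(math.pi / L) ** 2 * diffusivity * t)
        print(t, full, first)   # midplane temperature: full series vs. n = 1 term
```

For the uniform initial temperature the coefficients come out as $4/n\pi$ for odd $n$ and zero for even $n$, and at the later times the full series and its first term agree closely, which is the single sine loop described in the text.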
Thus, for instance, if it is originally all at a constant high temperature, and then is cooled, the original temperature curve would rise discontinuously from 0 at the edge to a constant value $T_0$ inside. But after a time the curve would be like a single loop of a sine curve, showing that the edges would cool more rapidly than the middle.

The transient flow of heat in bodies of other shape may be considered by extensions of the same method. Thus the transient flow in the cylinder or sphere can be handled by introducing cylindrical or spherical polar coordinates, and separating variables just as for the vibration problems. The solutions, as far as the coordinates are concerned, come out as with vibrations, leading, for example, to sines and cosines of the angle, and Bessel's functions of $r$, in the case of two-dimensional flow in a circle or cylinder, but the time enters as a real exponential damping down to zero, rather than a complex exponential or sinusoidal function. Special cases are discussed in the problems.

132. Integral Method for Heat Flow. There is another, different, method of great use in discussing the transient flow of heat. This method is based on an important particular solution of the heat-flow equation. If we consider again the one-dimensional flow, and let $a^2 = k/c\rho$, we can easily show that the function

$$f(x - x', t) = \frac{1}{2a\sqrt{\pi t}}\,e^{-(x - x')^2/4a^2 t} \tag{7}$$

is a solution of the equation, where $x'$ is an arbitrary constant.

Fig. 34. Function $f(x - x', t)$ of Eq. (7), as function of $x$, for different $t$'s. The function represents temperature distribution at different times resulting from initial conditions where the temperature is infinite at $x'$, zero elsewhere.

To prove this, it is only necessary to substitute in the differential equation. The graph of the function $f$, plotted against $x$ for different values of $t$, as in Fig.
34, has a sharp maximum at $x = x'$, looking like the familiar Gauss curve for probability distributions. At $t = 0$ the curve is coincident with the $x$ axis everywhere except at $x = x'$, where it forms an infinitely high and narrow mountain, so that the area under the curve is finite. As time goes on, this mountain becomes flatter and broader, until finally the function is zero everywhere.

The function $f$ can be used to discuss the following problem: At $t = 0$ the temperature throughout an infinite body is given by a function $T_0(x)$, and we are interested in the way in which this temperature distribution changes with time. We can break up the problem into a sum of other simpler problems, by dividing up the distance $x$ into small intervals, by a succession of points $x_0, x_1, x_2, \dots, x_n$. We set up the following problems:

1. The initial temperature is $T_0(x_0)$ between $x_0$ and $x_1$, but is zero elsewhere;

2. The initial temperature is $T_0(x_1)$ between $x_1$ and $x_2$, but is zero elsewhere;

n. The initial temperature is $T_0(x_{n-1})$ between $x_{n-1}$ and $x_n$, but is zero elsewhere.

The initial temperature distribution connected with one of these problems would be similar to the curve of Fig. 34, for very small value of $t$, in that it would be large in a very small region, negligible or zero elsewhere. To make the maximum come at the right place, we must choose $x'$ for the $i$th problem equal to $x_i$. As time goes on, the function $f$ gives a good approximation to the way in which the temperature in this simple problem changes. Now if, at $t = 0$, we add together all the temperatures of Probs. 1 to $n$, we get the correct initial distribution of temperature. Therefore, if we add all the solutions at a later time, we again get the solution for the whole problem. This, of course, actually becomes an integral, the element of the integrand connected with the interval $dx_i$, which equals $x_{i+1} - x_i$, being proportional to $T_0(x_i)\,f(x - x_i, t)\,dx_i$.
As a matter of fact, the constant of proportionality in $f$ is so chosen that this gives just the right answer:

$$T(x, t) = \int_{-\infty}^{\infty} T_0(x')\,f(x - x', t)\,dx'. \tag{8}$$

To prove this, we need to do two things: first, prove that it is a solution of the heat-flow equation; secondly, show that it approaches the correct value at $t = 0$. The first is obvious, for the integrand, regarded as a function of $x$ and $t$, has already been shown to be a solution of the equation, and on account of the linear nature of the differential equation a sum of solutions is a solution. For the second, we note that at $t = 0$ the function $f(x - x', t)$ has appreciable values only at $x = x'$. The whole integral will then come from the immediate neighborhood of $x' = x$, so that we may insert this value in $T_0$, and take it outside the integral sign, obtaining

$$T(x, 0) = T_0(x)\int_{-\infty}^{\infty} f(x - x', 0)\,dx'.$$

The integral is $\dfrac{1}{\sqrt{\pi}}\displaystyle\int_{-\infty}^{\infty} e^{-u^2}\,du$, where $u = \dfrac{x - x'}{2a\sqrt{t}}$, and this equals unity. Hence we have shown that $T(x, 0) = T_0(x)$, so that we have verified our solution.

By a slight variation, it is possible to solve the problem in which the temperature of a semi-infinite slab bounded by $x = 0$ is initially any desired value, and in which the surface is kept at $T = 0$ at all subsequent times. Let the initial temperature be $T_0(x)$, where this function is defined only for positive $x$'s, inside the slab. We now define an odd function equal to $T_0(x)$ for positive $x$'s, equal therefore to $-T_0(-x)$ for negative $x$'s. If we set up an infinite slab with this temperature distribution, then on account of symmetry the temperature at $x = 0$ will always be zero, and our boundary condition is satisfied, the part of the solution for positive $x$'s being the desired function. Integral methods similar to that described can be used also to discuss the problem in which the surface of a semi-infinite slab is kept at a temperature which varies in an arbitrary way with time.
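The integral of Eq. (8) and the odd continuation for the semi-infinite slab can both be tried out numerically. The sketch below is an illustration under stated assumptions, not a prescribed procedure: it evaluates Eq. (8) by the midpoint rule for an initial temperature of unity between $x = -1$ and $x = 1$ (zero elsewhere), and compares the result with the closed form in terms of the error function. The names `kernel`, `evolve`, `box`, `box_exact`, and `odd_extension`, and the values $a = 1$, $t = 0.1$, are hypothetical choices.

```python
import math

def kernel(x, x_src, t, a):
    """Eq. (7): f(x - x', t), with a**2 standing for k/(c*rho)."""
    return math.exp(-((x - x_src) ** 2) / (4.0 * a * a * t)) / (
        2.0 * a * math.sqrt(math.pi * t)
    )

def evolve(T0, x, t, a, lo, hi, n=4000):
    """Midpoint-rule estimate of Eq. (8), with x' restricted to [lo, hi]."""
    dx = (hi - lo) / n
    total = 0.0
    for j in range(n):
        x_src = lo + (j + 0.5) * dx
        total += T0(x_src) * kernel(x, x_src, t, a)
    return total * dx

def box(x):
    """Initial temperature: unity between x = -1 and x = 1, zero elsewhere."""
    return 1.0 if -1.0 <= x <= 1.0 else 0.0

def box_exact(x, t, a):
    """Closed form of Eq. (8) for the box profile, via the error function."""
    s = 2.0 * a * math.sqrt(t)
    return 0.5 * (math.erf((x + 1.0) / s) - math.erf((x - 1.0) / s))

def odd_extension(T0):
    """Odd continuation of T0 to negative x, for a surface held at zero."""
    return lambda x: T0(x) if x > 0 else (-T0(-x) if x < 0 else 0.0)

if __name__ == "__main__":
    a, t = 1.0, 0.1
    for x in (0.0, 0.5, 1.0, 2.0):
        print(x, evolve(box, x, t, a, -1.0, 1.0), box_exact(x, t, a))
    # Semi-infinite slab initially at unity: the surface x = 0 stays at zero.
    slab = odd_extension(lambda x: 1.0)
    print(evolve(slab, 0.0, t, a, -10.0, 10.0), evolve(slab, 0.5, t, a, -10.0, 10.0))
```

Restricting the integration to a finite range is harmless here because the box profile vanishes outside $|x'| \le 1$ and the kernel is negligible far from $x$; for other initial profiles the limits `lo` and `hi` would need to be chosen with that in mind.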
Two- and three-dimensional problems can also be treated, though the principles are not essentially different from those already considered.

One interesting feature of heat flow is brought out by the integral solution which we have just used. That is its irreversible nature. Thermodynamically, heat conduction is a typical irreversible process, and this is shown in the fact that heat always flows from the warmer to the cooler body, never in the opposite direction. With reversible processes, as for instance vibration problems, one can change the sign of the time where it appears in the solution and still have a possible solution of the equation; a vibration running backward is not essentially different from one running forward. But that is not the case in the heat-flow equation, as we see easily from Eq. (7), where, if we attempt to give $t$ a negative value, the solution becomes imaginary. The essential mathematical difference between the two cases is that in heat flow a first time derivative appears, while in vibration problems and wave equations there is a second time derivative. This second time derivative is unchanged when $t$ is changed to $-t$, whereas the first time derivative in the heat-flow equation changes sign with $t$, so that, if a given function satisfies the equation, it will no longer satisfy it if time is reversed.

Problems

1. Derive the divergence, gradient, and Laplacian in spherical polar coordinates by the general method of this chapter.

2. Discuss the steady flow of heat in a spherical shell contained between two concentric spheres, the temperature being an arbitrary function of position over both surfaces.

3.
Discuss the steady two-dimensional flow of heat in a semi-infinite rectangular bar bounded by $x = 0$, $x = L$, $y = 0$, extending to infinity along the $y$ axis, subject to the boundary condition that the temperature is zero along the two infinite sides of the bar, but that it is an arbitrary function of $x$ along the end from $x = 0$ to $x = L$. Build up the solution out of individual solutions varying sinusoidally with $x$, and exponentially with $y$, noting that they must decrease rather than increase exponentially as $y$ increases.

4. Discuss the steady flow of heat in a semi-infinite cylindrical rod with a flat end, if the temperature is kept at zero along the cylindrical face, but is an arbitrary function of position on the end.

5. A slab is heated to a uniform temperature $T_1$, then plunged in a bath which keeps its temperature at $T_0$. Find the interior temperature as a function of the time, computing and drawing several graphs, so chosen as to show the progress of the cooling process.

6. For small times after the cooling process has commenced in Prob. 5, the interior temperature will not have changed appreciably, and the slab will act practically like a semi-infinite slab. Compare the solution of Prob. 5, using Fourier's method, with the corresponding solution by the integral method, computing both curves and comparing.

7. In an infinite body the temperature is initially unity between the planes $x = -1$ and $x = 1$, and is zero everywhere else. Plot the temperature as a function of $x$ for several instants of time, and finally for $t = \infty$. (The result can be expressed in terms of the integral $\int e^{-u^2}\,du$.)

8. Prove that the integral $\displaystyle\int_0^{\infty} e^{-u^2}\,du = \frac{\sqrt{\pi}}{2}$. (Suggestion: Multiply this integral by the equal integral $\displaystyle\int_0^{\infty} e^{-v^2}\,dv$, and consider $u$ and $v$ as Cartesian coordinates in a plane. Introduce polar coordinates in the plane, carrying out the integration in those coordinates.)

9. Show that a particular integral of the equation for heat flow in an infinite medium is $\dfrac{\text{constant}}{t^{3/2}}\,e^{-r^2/4a^2 t}$, where $r$ is the distance from the origin.
Discuss the initial temperature distribution corresponding to this solution.

10. Show that the integral

$$T = \frac{1}{8a^3(\pi t)^{3/2}}\iiint T_0(x', y', z')\,e^{-r^2/4a^2 t}\,dx'\,dy'\,dz'$$

is a general solution of the heat-flow equation in three dimensions corresponding to an initial temperature distribution of $T_0(x, y, z)$, where $r^2 = (x - x')^2 + (y - y')^2 + (z - z')^2$.

CHAPTER XIX

ELECTROSTATICS, GREEN'S THEOREM, AND POTENTIAL THEORY

The problems of electrostatics are practically identical mathematically with those of flow, which we have been considering in the last few chapters. The fundamental physical law is very simple. Electric charges exert forces on each other, given by Coulomb's law, which states that the force is directed along the line of centers, and equal to $ee'/r^2$, where $e$ and $e'$ are the strengths of the charges, $r$ the distance between. The force on a particular charge is then given as the sum of the individual attractions and repulsions exerted by all the other charges. The force per unit charge at any point is the intensity of the electric field, a vector function of position. The lines tangent to the force vector, similar to the lines of flow in the last two chapters, are called the lines of force.

133. The Divergence of the Field. Consider the field of a point charge at the origin of coordinates. The field intensity $\mathbf{E}$ is a vector of magnitude $e/r^2$, pointing out along the radius; its components are thus

$$\frac{ex}{r^3}, \qquad \frac{ey}{r^3}, \qquad \frac{ez}{r^3}.$$

We then have

$$\operatorname{div}\mathbf{E} = \frac{\partial}{\partial x}\!\left(\frac{ex}{r^3}\right) + \frac{\partial}{\partial y}\!\left(\frac{ey}{r^3}\right) + \frac{\partial}{\partial z}\!\left(\frac{ez}{r^3}\right) = e\left[\frac{3}{r^3} - \frac{3(x^2 + y^2 + z^2)}{r^5}\right] = 0.$$

We thus see that the field of a point charge is divergenceless. In other words, if we represent the field strength by the number of lines of force per square centimeter, these lines will never start or stop in empty space. They will, of course, start or stop on charges. We cannot see this directly, but we can prove it by using Gauss's theorem. Take a small sphere of radius $R$ about the origin.
Then we know that the volume integral of the divergence of $\mathbf{E}$ over the volume equals the surface integral of the normal component of $\mathbf{E}$. This component is $e/R^2$, and the surface area is $4\pi R^2$, so that the surface integral in question is $4\pi e$. Thus the volume integral of the divergence over our small volume is $4\pi e$, which is different from zero. Since the number of lines emerging per unit area equals the field strength, the total number of lines of force diverging from the charge $e$ is also $4\pi e$.

Now consider the field of many point charges. The field of each charge separately has zero divergence. Therefore, since the divergence of the sum of several functions is the sum of the divergences, it is plain that the divergence of the whole field vanishes: $\operatorname{div}\mathbf{E} = 0$ in general. The only exception is for those points where there is charge, for there we have seen that the divergence does not vanish. Let us see what does happen there. In the first place we introduce $\rho$, the volume density of charge. Now take a small volume $dv$, containing a charge $\rho\,dv$. Surely if $dv$ is small enough this field will be just as if the same charges were concentrated at a point. Thus $4\pi\rho\,dv$ lines will diverge from the charge, or

$$\iint E_n\,dS = \operatorname{div}\mathbf{E}\,dv = 4\pi\rho\,dv.$$

Dividing by $dv$, we have

$$\operatorname{div}\mathbf{E} = 4\pi\rho. \tag{1}$$

This is the general equation for the divergence of the field, and we see that it reduces to $\operatorname{div}\mathbf{E} = 0$ at points where the charge density vanishes. This equation, $\operatorname{div}\mathbf{E} = 4\pi\rho$, is mathematically equivalent to the continuity equation

$$\frac{\partial\rho}{\partial t} = -\operatorname{div}\mathbf{f} + P,$$

if we set the time derivative equal to zero, and consider $4\pi\rho$ as the quantity analogous to the rate of production of material. Here, of course, there is no actual idea of flow, the analogy being merely mathematical.

134. The Potential. We can immediately show that the curl of the field of a point charge vanishes. And unlike the divergence equation, this is true everywhere, even right at the charge.
Then, if we superpose many charges, the curl still is zero, so that we have the general equation $\operatorname{curl}\mathbf{E} = 0$. This holds in all static cases (we shall later have a term to add to the equation, containing a time derivative). Thus we can always set up an electrostatic potential $\phi$, such that $\mathbf{E} = -\operatorname{grad}\phi$. Taking the divergence, we find the equation which the potential satisfies: it is

$$-\operatorname{div}\operatorname{grad}\phi = -\nabla^2\phi = 4\pi\rho, \tag{2}$$

which is called Poisson's equation. Laplace's equation $\nabla^2\phi = 0$ is the special case which holds in those regions of space that contain no charge.

If we form the line integral of the electric field intensity along a given curve between two points of the field, $A$ and $B$, then $\int \mathbf{E}\cdot d\mathbf{s}$ along this curve is called the electromotive force along the path. It is obviously the work per unit charge done by the field when a charge is moved along the given path from $A$ to $B$. In the electrostatic case, since $\mathbf{E}$ can be obtained from a potential, $\mathbf{E} = -\operatorname{grad}\phi$ and

$$\text{E.m.f.} = \int_A^B \mathbf{E}\cdot d\mathbf{s} = -\int_A^B \operatorname{grad}\phi\cdot d\mathbf{s} = -\int_A^B\left(\frac{\partial\phi}{\partial x}\,dx + \frac{\partial\phi}{\partial y}\,dy + \frac{\partial\phi}{\partial z}\,dz\right) = \phi_A - \phi_B,$$

so that in this case the e.m.f. is equal to the potential difference between the points $A$ and $B$. The distinction between e.m.f. and potential difference is of importance in cases where $\operatorname{curl}\mathbf{E} \neq 0$ and hence there is no potential. Even in this case we may still use the idea of e.m.f.

135. Electrostatic Problems without Conductors. There are two principal sorts of electrostatic problems. The first is that in which we know the distribution of charge, and wish to compute the field. We could always do this by direct summation of the fields due to the individual charges, but often that is very difficult, and we can simplify greatly by using the potential and Laplace's equation. Thus suppose we have charge uniformly distributed over an infinite plane, the amount per unit area being $\sigma$, and suppose we wish the field at a distance $R$ from that plane. We may get this by a direct calculation.
Thus we take a set of polar coordinates in the plane, which center at the point directly beneath the place where we wish the potential, as in Fig. 35. Between the circles of radius $r$ and $r + dr$, and between $\theta$ and $\theta + d\theta$, will be an amount of charge $\sigma r\,d\theta\,dr$. This will be at a distance $\sqrt{R^2 + r^2}$ from the point we are interested in, so that its field will have the magnitude $\dfrac{\sigma r\,d\theta\,dr}{R^2 + r^2}$. The component normal to the plane, which is all that we need, is this times $\dfrac{R}{\sqrt{R^2 + r^2}}$, or $\dfrac{\sigma R r\,d\theta\,dr}{(R^2 + r^2)^{3/2}}$. The total field is then

$$E = \int_0^{2\pi} d\theta \int_0^{\infty} \frac{\sigma R r\,dr}{(R^2 + r^2)^{3/2}} = 2\pi\sigma \int_0^{\infty} \frac{x\,dx}{(1 + x^2)^{3/2}}, \qquad \text{where } x = \frac{r}{R}.$$

Fig. 35. Field of a charged plane. From charge between $r$ and $r + dr$, $\theta$ and $\theta + d\theta$: $E = \dfrac{\sigma r\,d\theta\,dr}{R^2 + r^2}$, $E_n = \dfrac{\sigma R r\,d\theta\,dr}{(R^2 + r^2)^{3/2}}$.

Letting $1 + x^2 = y$, so that $x\,dx = dy/2$, this is

$$2\pi\sigma \int_1^{\infty} \frac{dy}{2\,y^{3/2}} = \frac{2\pi\sigma}{2}\left[-2\,y^{-1/2}\right]_1^{\infty} = 2\pi\sigma.$$

Thus the field is a constant, independent of position. Similarly on the other side of the plane it is $-2\pi\sigma$, so that there is a discontinuity in $E$ of $4\pi\sigma$ in crossing the surface.

We have seen that it is possible in such a simple case to compute the field directly. But it is done far more easily by using our general principles. Thus the potential can depend only on the coordinate normal to the plane, which we denote by $x$. Its differential equation, outside the charged sheet, is then

$$\frac{d^2\phi}{dx^2} = 0, \qquad \phi = ax + b,$$

showing that the field is constant everywhere, and in the $x$ direction. To investigate conditions on the surface, we set up a thin flat volume, with its broad sides parallel to the charged plane, and enclosing just 1 sq. cm. of this plane. It will then hold charge $\sigma$, so that $4\pi\sigma$ lines will diverge from it. By symmetry, these will leave it at right angles, and an equal number over each face. Hence $2\pi\sigma$ will leave over each face, or the field strength is $2\pi\sigma$ on the one side, $-2\pi\sigma$ on the other. We have the same result as before, with very much simpler calculation.
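The direct summation over rings of charge can also be carried out numerically, as a check that the normal field comes out as $2\pi\sigma$ whatever the height $R$. The sketch below is illustrative only: the function name `plane_field`, the substitution $r = Ru/(1 - u)$ used to map the infinite range of $r$ onto $0 \le u < 1$, and the sample values of $\sigma$ and $R$ are all assumptions, and the units are the electrostatic ones used in the text.

```python
import math

def plane_field(sigma, R, n=20000):
    """Normal field at height R above a uniformly charged infinite plane.

    Sums the normal components contributed by rings of radius r, using the
    midpoint rule in u, where r = R*u/(1 - u) maps [0, 1) onto [0, infinity).
    """
    du = 1.0 / n
    total = 0.0
    for j in range(n):
        u = (j + 0.5) * du
        r = R * u / (1.0 - u)
        dr = R * du / (1.0 - u) ** 2
        ring_charge = sigma * 2.0 * math.pi * r * dr   # theta integrated analytically
        total += ring_charge * R / (R * R + r * r) ** 1.5
    return total

if __name__ == "__main__":
    sigma = 1.0
    for R in (0.5, 1.0, 3.0):
        print(R, plane_field(sigma, R), 2.0 * math.pi * sigma)
```

The computed field should agree with $2\pi\sigma$ at every height, which is the constancy that the potential argument obtains without any integration.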
Similar problems are met in the theory of the condenser. Take, for example, the parallel plate condenser, as in Fig. 36, two charged plates of area $A$, so large in proportion to their separation $d$ that they can be almost treated as infinite. Let the charge per square centimeter be $\sigma$ on one plate, $-\sigma$ on the other. Then we must find the potential difference between the plates, for by definition the capacity $C$ is the charge on one plate divided by the potential difference. But now, just as in the last case, the field must be constant and perpendicular to the plates. It can have different values in the three regions to the left of the plates, between, and to the right. And it has a discontinuity of $4\pi\sigma$ in passing through a plate of surface density $\sigma$. These conditions are all satisfied by having no field outside the condenser, and by having a field $4\pi\sigma$ within, pointing from the positive plate to the negative.

Fig. 36. Field of a parallel plate condenser: $E = 0$ outside the plates, $E = 4\pi\sigma$ between them.

Thus the potential difference, being the field times the distance, is $4\pi\sigma d$, so that
\[ C = \frac{A\sigma}{4\pi\sigma d} = \frac{A}{4\pi d}, \tag{3} \]
the familiar formula for a parallel plate condenser. It should be noticed that capacitance has the dimensions of a length in the electrostatic system of units.

136. Electrostatic Problems with Conductors. The second sort of electrostatic problem is more difficult. It is that in which there are conductors as well as charges. Now in the presence of a charge, induced charges are set up on conductors, and it is usually a difficult problem to find how they are distributed, and hence to find their field. In this case it is practically indispensable to make use of the methods of potential theory. To see how to proceed, let us imagine the train of events which would occur when a charge was brought near a conductor. The charge would carry with it a field, which in general would be such that different parts of the conductor were at different potentials.
Now a conductor has the peculiarity that if there is a field in it, a current flows, and continues to flow as long as the field remains. Thus charge will start to flow through the conductor, being attracted or repelled by the external charge. This will continue until just such a charge distribution has been set up in the conductor that the field resulting from it plus the external charges reduces to zero within the conductor, or the potential throughout the conductor is constant, for this is the condition for no current flow. In other words, the whole of a conductor, surface and inside, is part of a single equipotential. We then solve such a problem in the following way: we look for a solution of Poisson's equation, holding in the region outside the conductors, and reducing to constants on the boundaries. This solution thus gives the potential of the problem, and its gradient gives the field.

We can illustrate better by a problem. Consider an infinite conducting plane, uncharged as a whole, with a charge $e$ in front of it at a distance $d$. Now we wish a solution of Poisson's equation, reducing to a constant over the face of the plane. We set this up by a device, called the method of images. We imagine the plate removed, its face replaced by an imaginary plane, and at a distance $d$ behind the plane we put a charge $-e$, as if it were $e$'s image in a mirror, as shown in Fig. 37. Then these two charges together would keep the whole plane just at potential zero. For any point of the plane is equidistant from both charges, one has the potential $e/r$, and the other $-(e/r)$, and they just cancel. The potential at any point of space can be easily found, now, in the field of these charges. It is simply $e/r_1 - e/r_2$, if $r_1$ is the distance from the charge $e$, $r_2$ the distance from its mirror image. The lines of force and equipotentials look like those of a bar magnet, and it is perfectly true that the plane bisecting the magnet is an equipotential. In our actual problem, now, the potential in the empty space is just that given by our field of two charges; in the metal the potential is zero.

Fig. 37. Lines of force for charge $e$ in front of conducting plane, by method of images.

We might naturally inquire what induced distribution of charge would be set up in the conducting plane, to produce this final field. In the first place, in a steady state, the charge within a conductor is always zero. For the field is zero within it, therefore its divergence is zero. Thus all charge is concentrated on the surface. Next, as we showed before, the normal component of the electric field has a discontinuity of $4\pi\sigma$ at a surface carrying a surface charge $\sigma$. Thus if we can compute the discontinuity, we can in turn get the surface density of charge. In our case the field is normal to the plate, by symmetry, so that the discontinuity of $E_n$ in crossing the surface is just equal to the total $E$ outside. This may be found at once from our known potential function, so that we could get the necessary surface charge.

137. Green's Theorem. The fundamental theorem of potential theory is a mathematical relation called Green's theorem. It is a result of Gauss's theorem, and is easily proved. Gauss's theorem states that $\iiint \operatorname{div}\mathbf{E}\,dv = \iint E_n\,dS$ for any vector $\mathbf{E}$. Now let $\mathbf{E} = \phi\operatorname{grad}\psi$, where $\phi$ and $\psi$ are two scalar functions, then $\operatorname{div}\mathbf{E} = \operatorname{div}(\phi\operatorname{grad}\psi) = \phi\nabla^2\psi + \operatorname{grad}\phi\cdot\operatorname{grad}\psi$, as we can easily prove. Also $E_n = \phi\,\partial\psi/\partial n$, where $\partial\psi/\partial n$ is the normal derivative, the component of the gradient along $n$. Hence we have
\[ \iiint \left(\phi\nabla^2\psi + \operatorname{grad}\phi\cdot\operatorname{grad}\psi\right)dv = \iint \phi\frac{\partial\psi}{\partial n}\,dS. \tag{4} \]
This is one form of Green's theorem. To get the more familiar form, we next write just the same expression with $\phi$ and $\psi$ interchanged:
\[ \iiint \left(\psi\nabla^2\phi + \operatorname{grad}\psi\cdot\operatorname{grad}\phi\right)dv = \iint \psi\frac{\partial\phi}{\partial n}\,dS. \]
Now we subtract, obtaining
\[ \iiint \left(\phi\nabla^2\psi - \psi\nabla^2\phi\right)dv = \iint \left(\phi\frac{\partial\psi}{\partial n} - \psi\frac{\partial\phi}{\partial n}\right)dS. \tag{5} \]
This is the common form of Green's theorem. We shall now consider a number of applications of this mathematical theorem. These applications come mostly in the discussion of methods of solving Poisson's and Laplace's equations. Of course, these can be solved by the method of separation of variables, and development in series of orthogonal functions. But the present method, called Green's method, is quite different, and almost more useful in a general discussion, though perhaps not in particular problems.

138. Proof of Solution of Poisson's Equation. We can easily see how to solve Poisson's equation, $\nabla^2\phi = -4\pi\rho$. For this gives the potential $\phi$ due to a charge distribution. Now if we divide space into small elements of volume $dv$, the charge $\rho\,dv$ will exert a potential $\rho\,dv/r$, if $r$ is the distance from the point where we wish the potential to $dv$. Thus the whole potential is
\[ \phi = \iiint \frac{\rho}{r}\,dv. \]
But $\rho = -\dfrac{1}{4\pi}\nabla^2\phi$, so that we have
\[ \phi = -\frac{1}{4\pi}\iiint \frac{\nabla^2\phi}{r}\,dv, \tag{6} \]
giving the solution of Poisson's equation. In this integral, we must integrate over all space, so as to include all charges.

We have derived our solution rather intuitively from the known solution for a point charge. But we can derive it rigorously from Green's theorem. In the last form of Green's theorem, let $\psi = 1/r$, where $r$ is the distance from a point $P$, and let $\phi$ be the potential $\phi$. Thus we have
\[ \iiint \left[\phi\nabla^2\!\left(\frac{1}{r}\right) - \frac{1}{r}\nabla^2\phi\right]dv = \iint \left[\phi\frac{\partial(1/r)}{\partial n} - \frac{1}{r}\frac{\partial\phi}{\partial n}\right]dS. \]
This is true no matter what volume we use. Let us choose as our volume the whole of space, except for a tiny sphere of radius $R$ surrounding the point $P$ where we wish to compute the potential. Now $\nabla^2(1/r) = 0$, except where $r = 0$, so that it is zero throughout the whole of our volume, and the left side becomes $-\iiint (\nabla^2\phi/r)\,dv$. Let us compute the right side. The integral is to be taken over the surface of our volume, which consists of our tiny sphere, and a surface at infinity, which for the present we neglect.
Over the surface of the tiny sphere, the direction $n$ is simply the radial direction, pointing in toward $P$ (because it is directed out of the volume). We have
\[ \frac{\partial(1/r)}{\partial n} = -\frac{\partial(1/r)}{\partial r} = \frac{1}{r^2}, \qquad \frac{\partial\phi}{\partial n} = -\frac{\partial\phi}{\partial r}. \]
Then the right side is
\[ \iint \left(\frac{\phi}{r^2} + \frac{1}{r}\frac{\partial\phi}{\partial r}\right)dS. \]
But on the surface of the sphere, $r = R$, so that this is
\[ \frac{1}{R^2}\iint \phi\,dS + \frac{1}{R}\iint \frac{\partial\phi}{\partial r}\,dS. \]
Now $\iint\phi\,dS \big/ \iint dS$ is just the mean value $\bar\phi$ of $\phi$ over the surface, and $\iint(\partial\phi/\partial r)\,dS \big/ \iint dS$ is the mean value of $\partial\phi/\partial r$. But $\iint dS$ is the area of the sphere $= 4\pi R^2$, so that our integral is $4\pi\bar\phi + 4\pi R\,\overline{\partial\phi/\partial r}$, and the whole relation is, changing sign,
\[ \iiint \frac{\nabla^2\phi}{r}\,dv = -4\pi\bar\phi - 4\pi R\,\overline{\frac{\partial\phi}{\partial r}}, \]
or
\[ \bar\phi = -\frac{1}{4\pi}\iiint \frac{\nabla^2\phi}{r}\,dv - R\,\overline{\frac{\partial\phi}{\partial r}}. \]
If now $R$ approaches zero, the last term vanishes, and $\bar\phi$ approaches $\phi$, the value at the point $P$. Hence we have the solution of Poisson's equation which we wished to prove.

There are several points to be mentioned in connection with this proof. In the first place, the volume integral is taken over all space, except an infinitely small sphere surrounding $P$: a point charge exerts an effect on all other charges, but not on itself. Secondly, we neglected entirely the fact that our volume has a surface at infinity, which we should take into account in calculating our surface integrals. Suppose that the volume were not really infinite, but merely very large, being bounded, say, by a second large sphere of radius $R'$. Then the surface integral over the large sphere is similar to that over the small one, but with opposite sign: it is $-\left(4\pi\bar\phi' + 4\pi R'\,\overline{\partial\phi'/\partial r}\right)$, where now $\bar\phi'$ is the mean over the large sphere, etc. To neglect these terms, as we have done, their limits must be zero as $R'$ becomes infinite. That is, $\bar\phi'$ must go to zero at infinite distance, and $R'\,\overline{\partial\phi'/\partial r}$ must also go to zero. These are both satisfied if $\phi$ is the potential of a set of charges at finite points, for then $\phi$ will go as $1/r$, $\partial\phi/\partial r$ will go as $1/r^2$, and $r\,\partial\phi/\partial r$ will fall off as $1/r$, becoming zero as $r$ becomes infinite.

139. Solution of Poisson's Equation in a Finite Region. Suppose now that instead of extending our integral over all space, we integrate only over a finite volume $V$, with surface $S$, excluding in each case our infinitesimal sphere of radius $R$. Then plainly we have
\[ \phi = -\frac{1}{4\pi}\iiint \frac{\nabla^2\phi}{r}\,dv + \frac{1}{4\pi}\iint \frac{1}{r}\frac{\partial\phi}{\partial n}\,dS - \frac{1}{4\pi}\iint \phi\frac{\partial(1/r)}{\partial n}\,dS, \tag{7} \]
where the volume integral is taken over the whole volume $V$, excluding the infinitesimal sphere, and the surface integral is taken over $S$. We can explain this important formula in words much better than by mathematics. The potential at a given point can be written as the sum of two parts: the potential of all the charges within a certain finite volume surrounding the point, and another part, which, of course, must represent the potential of the other charges outside our volume. But the second term appears as a surface integral, not a volume integral. This is an example of the usual sort of application of Green's theorem: the replacement of a volume integral by a surface integral.

There is one interesting way of regarding the solution. Suppose first that $\rho$ were zero all through our volume, though not outside. Then $\nabla^2\phi$ will be zero inside, and the volume integral will vanish. Further, $\phi$ will satisfy Laplace's equation within the region. The surface integral, in other words, represents a solution of Laplace's equation within our region, in terms of an integral over the boundary of the region. As a matter of fact, any solution of Laplace's equation in this region can be written in this way, by using the proper boundary values of $\phi$ and $\partial\phi/\partial n$ at the surface. The last two terms in our solution, in other words, represent a general solution of the homogeneous equation $\nabla^2\phi = 0$, the arbitrary functions (which with partial differential equations replace the arbitrary constants) being the boundary values of $\phi$ and $\partial\phi/\partial n$. The volume integral, on the other hand, represents a particular solution of the inhomogeneous equation $\nabla^2\phi = -4\pi\rho$, satisfying the equation but not its boundary values.
Thus we have the familiar case in which the solution of an inhomogeneous equation is the sum of a particular solution, and the general solution of the related homogeneous equation. And this general solution is to be so chosen that the sum of both terms satisfies the boundary values, on the surface of the volume.

140. Green's Distribution. When we examine the surface integral of Eq. (7) more in detail, we can see what it represents. The term $\dfrac{1}{4\pi}\iint \dfrac{1}{r}\dfrac{\partial\phi}{\partial n}\,dS$ represents evidently the potential arising from a certain surface charge, of surface density $\dfrac{1}{4\pi}\dfrac{\partial\phi}{\partial n}$. The other term, $-\dfrac{1}{4\pi}\iint \phi\,\dfrac{\partial(1/r)}{\partial n}\,dS$, is a little complicated. The term $\partial(1/r)/\partial n$ is the difference between the potential of two unit charges, spaced at a distance $dn$ along the normal, divided by $dn$; that is, it is the potential of two charges, one of strength $1/dn$, the other $-1/dn$, at distance $dn$, as in Fig. 38. Such a combination of an equal and opposite positive and negative charge very close together is called a dipole. The strength of a dipole, or the dipole moment, is the strength of one of the charges times the distance of separation. Thus in our case the strength is $(1/dn)\,dn$, so that we have the potential of a unit dipole. Then the integral is the potential of a dipole distribution of moment $\phi/4\pi$ per unit area. Such a distribution is called a double layer, since it consists of layers of positive and negative charges close together.

Fig. 38. Potential of unit dipole, consisting of charges $\pm 1/dn$ at distance $dn$ apart.

We then see that by spreading on the surface of our region a suitable layer of surface charge, and a double layer of dipoles, we produce just the same field inside that the external charges would give. This distribution of charge and double layer is called Green's distribution.

Suppose that we know that a given function $\phi$ satisfies Laplace's equation within a given region.
Suppose further that we know its boundary value $\phi$, and its normal derivative $\partial\phi/\partial n$, at all points of the surface of the region. Then we can at once write the solution of Laplace's equation having these boundary values. It is
\[ \phi = -\frac{1}{4\pi}\iint \left(\phi\frac{\partial(1/r)}{\partial n} - \frac{1}{r}\frac{\partial\phi}{\partial n}\right)dS, \]
integrated over the boundary. This is obviously a very simple way of getting a solution of a differential equation satisfying given boundary values. In particular it is simpler than the methods we have used so far, in that we can apply it to any form of surface.

There is a simple interpretation of Green's distribution. Suppose that the field within our volume were just what it is, but that outside the volume the field and potential were everywhere zero. Then at the boundary there would be a discontinuity of potential and field. Now we have already seen that at a surface charge $\sigma$ there is a discontinuity of field, $4\pi\sigma$, so that at a discontinuity of the field there is a surface charge equal to $1/4\pi$ times the discontinuity of the normal component of the field. Thus if the normal component of the field is zero outside, $\partial\phi/\partial n$ inside, the surface charge is $\dfrac{1}{4\pi}\dfrac{\partial\phi}{\partial n}$. This is just the surface charge concerned in Green's distribution. Similarly, at a boundary where there is a discontinuity of potential, there must be a double layer, of moment per unit area equal to $1/4\pi$ times the discontinuity of the potential, as we see from a condenser of charge $\sigma$, dipole moment $\sigma d$ per unit area, potential difference $4\pi\sigma d$. This gives the double layer of Green's distribution. In other words, these surface charges and layers, plus the charges within the region, are just those necessary to give the potential its actual values within the volume, and to reduce it to zero outside.

141. Green's Method of Solving Differential Equations. We have seen in the present chapter a method, called Green's method, for solving differential equations, quite different from any we have met before, except the integral method of treating heat flow, which is very similar. The most characteristic part of the method is in the solution of Poisson's equation, as an integral of $\rho/r$ over all space. Here we had an inhomogeneous equation, $\nabla^2\phi = -4\pi\rho$. Suppose we let $\rho = \rho_1 + \rho_2 + \rho_3 + \cdots$, where $\rho_i$ is equal to $\rho$ in the $i$th volume element $dv_i$, but is zero elsewhere. Then we can write the equations $\nabla^2\phi_1 = -4\pi\rho_1$, $\nabla^2\phi_2 = -4\pi\rho_2$, $\ldots$, for each of these, where $\rho_i$ is different from zero only in a very small region, so that the problem is practically that of a point charge, which we can solve. We add these functions to get the whole solution, according to Sec. 26, Chap. IV. This is the essence of Green's method, the separation of the inhomogeneous part of the equation into simple parts, each of which we can solve. The function $1/r$, which is the solution for one of these problems, is called the Green's function. As a matter of fact, a general method of solving differential equations by means of Green's functions has been worked out, and it lies at the basis of much of the more advanced work on the theory of differential equations, particularly of the second order.

Problems

1. Given a spherical distribution of charge, in which the density is a function of $r$. Prove that the field at any point is what would be obtained by imagining a sphere drawn through the point, with its center at the origin, all the charge within the sphere concentrated at the center, and all the charge outside removed. Apply to gravitation, showing that the earth acts on bodies at its surface as if its mass were concentrated at the center.

2. Given a sphere filled with charge of constant density.
Prove that at points within the sphere, the field is directly proportional to the distance from the center.

3. A condenser consists of two concentric spheres, holding equal and opposite charges. Find its capacity. Similarly find the capacity of a condenser consisting of two long concentric circular cylinders.

4. Compute the surface density induced by a charge on a plane conductor.

5. In a certain spherical distribution of charge, the potential is given by $e^{-ar}/r$. Find the charge density as a function of $r$. Also find the charge contained between $r$ and $r + dr$. This represents roughly the charge distribution within an atom.

6. Prove $\operatorname{div}(\phi\operatorname{grad}\psi) = \phi\nabla^2\psi + \operatorname{grad}\phi\cdot\operatorname{grad}\psi$.

7. There are certain charges and conductors in an electrostatic field, whose potential is $\phi$. Show that the surface density of charge on the surface of a conductor is $-\dfrac{1}{4\pi}\dfrac{\partial\phi}{\partial n}$, where $n$ is the normal pointing out of the conductor. Show that the electric field is normal to the surface of a conductor.

8. It requires several volts energy to remove an electron from the interior of a metal to the region outside. Find how many volts, if the double layer at the surface consists of two parallel sheets of charge, a sheet of negative electricity, of density as if there were electrons of charge $4.77 \times 10^{-10}$ e.s.u., spread out uniformly with a density of one to a square $4 \times 10^{-8}$ cm. on a side, and inside that at a distance of $0.5 \times 10^{-8}$ cm. a similar sheet of positive charges. Remember that 300 volts = 1 e.s.u. of potential.

9. Discuss the potential and field of a dipole.

10. An uncharged metallic sphere of radius $R$ is placed in a homogeneous electric field of intensity $E_0$. Calculate the potential at any point of space, and sketch the equipotential curves. (Hint: Solve Laplace's equation in polar coordinates taking the $z$ axis as the direction of $E_0$. Note that there is symmetry about the $z$ axis.
Try a solution of the form $\phi = F_1(r) + F_2(r)\cos\theta$ with the conditions that $F_1(r) \to 0$ as $r \to \infty$, $F_2(r) \to -E_0 r$ as $r \to \infty$, and that $\phi$ must be constant all over the sphere of radius $R$.) Solve the problem for the case that the sphere carries a total charge $e$.

11. The equipotentials due to two point charges $e$ and $e'$ are given by $e/r + e'/r' = C$. Show that the surface becomes spherical if $e$ is of opposite sign to $e'$ and $C = 0$. Consider a spherical conductor coinciding with this surface which is grounded. This does not disturb the field, so that these charges give the field we would have if one of the charges were removed and the metallic sphere left there. Show that if $a$ is the radius of the sphere and $L$ the distance from the charge (outside the sphere) to the center of the sphere, the image charge inside the sphere lies a distance $L'$ from the center such that $a^2 = LL'$ and has a charge $e' = -ea/L$. Show that the surface density of induced charge varies inversely as the cube of the distance from the charge outside the surface to the point of the surface under consideration.

CHAPTER XX
MAGNETIC FIELDS, STOKES'S THEOREM, AND VECTOR POTENTIAL

The static magnetic field resembles the electrostatic field in many ways. The intensity of the field due to a magnetic pole is equal to the pole strength divided by the square of the distance of the point at which the intensity is measured, so that magnetic poles display close analogy to electric charges. The intensity of this field $\mathbf{H}$ is defined as the force per unit magnetic pole, and this is measured in the system of units known as the electromagnetic, as distinct from the electrostatic. We shall discuss the relation between these systems of units in a later section. The vector $\mathbf{H}$ satisfies the equation $\operatorname{div}\mathbf{H} = 4\pi \times$ density of magnetic poles, but here a very important difference appears; north and south magnetic poles never can exist alone.
No matter how small one takes a volume element, the north and south poles just cancel, so that the total density of magnetic poles is zero. Hence we have
\[ \operatorname{div}\mathbf{H} = 0. \tag{1} \]
Thus we must always deal with at least a pair of opposite poles, and here we always have a magnetic dipole, whose behavior is just like that of an electric dipole. The magnetic moment of a bar magnet is defined as the product of the strength of one of the poles times the distance of separation, and magnetic fields are measured by measuring the torque exerted on a suspended magnet (magnetometer). Exactly as we have defined the electromotive force $\int_A^B \mathbf{E}\cdot d\mathbf{s}$, we can now define as the magnetomotive force $\int_A^B \mathbf{H}\cdot d\mathbf{s}$. This is the work per unit pole done by the magnetic field as the pole is moved along a path from $A$ to $B$. There is also a magnetic potential $\Phi = -\int \mathbf{H}\cdot d\mathbf{s}$, and in the field of permanent magnets $\oint \mathbf{H}\cdot d\mathbf{s}$ taken around any closed path is zero.

142. The Magnetic Field of Currents. It is when we come to consider the magnetic fields due to currents that we meet differences from the electrostatic case. Suppose that we have a straight wire in which a steady current flows. The magnetic lines of force are concentric circles around the wire and it is clear that if we calculate the integral $\oint \mathbf{H}\cdot d\mathbf{s}$ following one of these circles, we shall not find that its value is zero for such a closed path. On the other hand if we evaluate $\oint \mathbf{H}\cdot d\mathbf{s}$ around any closed path which does not encircle the wire, it does vanish, and the situation is then analogous to the electrostatic case. These considerations hold for any closed circuit carrying a current. We can reduce our problem to an ordinary magnetostatic one by the following device: suppose that we construct a surface bounded by the wire carrying the current and do not allow any of the curves along which we calculate $\int \mathbf{H}\cdot d\mathbf{s}$ to cut through this surface.
Then no closed paths are possible which encircle the current, $\oint \mathbf{H}\cdot d\mathbf{s} = 0$ around every path, and everywhere in space there is a magnetic potential $\Phi$. Suppose we evaluate $\int \mathbf{H}\cdot d\mathbf{s}$ along a curve starting at $a$ on one side of the surface and following a line of force around to a point $b$ on the other side of the surface, as in Fig. 39. The difference of magnetic potential between $a$ and $b$ is given by
\[ \Phi_a - \Phi_b = \int_a^b \mathbf{H}\cdot d\mathbf{s} = -\int_a^b \frac{\partial\Phi}{\partial s}\,ds, \]
and the potential difference does not approach zero as we let $a$ approach $b$ since then the curve would cut our surface. This must mean that there is a jump in potential as we cross the surface.

Fig. 39. Magnetic shell and multiple valued potential. The potential difference between $a$ and $b$ is $4\pi m_0$, or $4\pi i$, where $m_0$ is the strength of the double layer producing the same magnetic field as the current $i$ in the wire encircling the shell.

We have already seen in the last chapter that a surface distribution of dipoles (a double layer) produces a discontinuity in potential, so that we can replace our current by a surface layer of magnetic dipoles on a surface whose boundary is the current-carrying wire, and produce exactly the same magnetic field as the current. Suppose that we have a surface of area $A$ on which we have a dipole layer of constant moment $m_0$ per unit area. (This may be either an electric or magnetic dipole layer.) Consider a point $P$ outside the surface. If one looks from $P$ to the surface, the surface subtends a solid angle $\Omega$. It is easy to show that the potential at $P$ is equal to $m_0$ times $\Omega$. The proof of this is left to a problem. In particular if $P$ approaches the surface, $\Omega$ approaches $2\pi$ so that the potential at a point just one side of the surface is $2\pi m_0$. Similarly on the other side of the surface the potential is $-2\pi m_0$, so that there is a discontinuity of potential equal to $4\pi m_0$ as one crosses the double layer.
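The statement that the potential of a uniform double layer is $m_0\Omega$ can be illustrated numerically in one simple case. The sketch below is an illustration only, under assumptions that are not in the text: the layer is taken to be a flat circular disk of radius `a`, the point $P$ lies on its axis at height `h`, and the function names and sample values are hypothetical. For that geometry the solid angle is that of a cone, $\Omega = 2\pi\left(1 - h/\sqrt{h^2 + a^2}\right)$.

```python
import math

def double_layer_potential_on_axis(m0, a, h, n=200_000):
    """Potential at height h on the axis of a uniform dipole disk of radius a.

    Each ring of radius rho and width d_rho carries dipole moment
    m0 * 2*pi*rho*d_rho along the normal. A dipole of moment p, seen at
    distance s and at angle alpha from its axis, contributes
    p*cos(alpha)/s**2, with cos(alpha) = h/s for a point on the axis.
    The rings are summed with the midpoint rule.
    """
    d_rho = a / n
    total = 0.0
    for k in range(n):
        rho = (k + 0.5) * d_rho
        s2 = h * h + rho * rho
        total += m0 * 2.0 * math.pi * rho * d_rho * h / (s2 * math.sqrt(s2))
    return total

def disk_solid_angle(a, h):
    """Solid angle subtended by a disk of radius a at an axial point at height h."""
    return 2.0 * math.pi * (1.0 - h / math.sqrt(h * h + a * a))

if __name__ == "__main__":
    m0, a = 1.0, 1.0  # illustrative values
    for h in (1.0, 0.1, 0.01):
        print(h, double_layer_potential_on_axis(m0, a, h), m0 * disk_solid_angle(a, h))
```

As `h` is made small compared with `a`, both columns should approach $2\pi m_0$, the value quoted for a point just on one side of the layer.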
Thus in our case we have $\Phi_a - \Phi_b = \pm 4\pi m_0$ (the $\pm$ depends on which way we go around the curve $ab$), so that $\oint \mathbf{H}\cdot d\mathbf{s} = \pm 4\pi m_0$ around a closed curve which cuts through the double layer surface and is zero for every other closed curve. In the following we shall always go around the curve in such a direction that $\oint \mathbf{H}\cdot d\mathbf{s} = 4\pi m_0$.

If we now ask how $m_0$ depends on the current, we must get the answer from experiment and the relation turns out to be exceedingly simple; the magnetic moment per unit area $m_0$ is proportional to the current. If we have not as yet defined the unit of current we may place $m_0 = i$, and this equation defines the unit current in the electromagnetic system of units. Thus
\[ \oint \mathbf{H}\cdot d\mathbf{s} = 4\pi i, \tag{2} \]
where the integration is carried once around a path encircling the wire. If we go around again the value of the integral increases again by $4\pi i$, and so on for every complete circuit of the path. This unit of current which we have introduced is called the abampere and is ten times as large as the practical unit, the ampere. On the other hand, we might wish to utilize the electrostatic unit of current, defined as the current in which one electrostatic unit of charge passes a given point per second. It is necessary to determine experimentally the proportionality constant between $m_0$ and $i$. This has been done and turns out to be $1/c$, where $c = 3 \times 10^{10}$ cm. per second. If we express our current in electrostatic measure, the work done in carrying a unit north pole around a circuit enclosing the current is
\[ \oint \mathbf{H}\cdot d\mathbf{s} = \frac{4\pi i}{c}. \tag{3} \]
The e.s.u. of current is $\tfrac{1}{3} \times 10^{-9}$ ampere.

143. Field of a Straight Wire. We can illustrate these ideas easily in the case of a straight wire carrying a constant current. Since the lines of force are circles, let us calculate the work done in carrying a unit pole around such a circle of radius $r$. In this case $H$ has the same value all along the circle and is tangent to it.
Thus
\[ \oint \mathbf{H}\cdot d\mathbf{s} = H\oint ds = 2\pi r H = 4\pi i, \]
so that the magnetic field intensity at a distance $r$ from a straight wire carrying a current $i$ is
\[ H = \frac{2i}{r}. \tag{4} \]
We can now set up the potential for this case. Thus, let the wire be along the $z$ axis, as in Fig. 40, so that the field is given by
\[ H_x = -\frac{2iy}{r^2}, \qquad H_y = \frac{2ix}{r^2}, \qquad H_z = 0. \]

Fig. 40. Magnetic lines of force (circles) and equipotentials (radii) for the field of a wire carrying a current (at right angles to the paper). $\mathbf{H}$ is perpendicular to radius. Therefore $H_x : H_y = -y : x$.

Then $\operatorname{curl}\mathbf{H} = 0$, as we can immediately prove by substitution. Thus, for example,
\[ \operatorname{curl}_z\mathbf{H} = \frac{\partial(2ix/r^2)}{\partial x} - \frac{\partial(-2iy/r^2)}{\partial y} = \frac{2i}{r^2} - \frac{4ix^2}{r^4} + \frac{2i}{r^2} - \frac{4iy^2}{r^4} = 0. \]
We can therefore have a potential, and it is easy to see that it must have as its equipotentials the lines $\theta = $ constant, where $\theta$ is the polar angle in the $xy$ plane, since these are at right angles to the lines of force. If we set $\Phi = -2i\theta = -2i\tan^{-1}(y/x)$, we have $-\partial\Phi/\partial x = -2iy/r^2 = H_x$, $-\partial\Phi/\partial y = 2ix/r^2 = H_y$, so that we have actually exhibited the potential. But now we see that the potential is not single-valued. For a given value of $x$ and $y$, the angle $\tan^{-1}(y/x)$ can have an infinite number of values, differing by $2\pi$, and the potential can have an infinite number of values differing by $4\pi i$, in agreement with what we found before. Thus the potential is not defined in as simple and definite a way as in electrostatics. The interpretation of this situation comes from a theorem called Stokes's theorem.

144. Stokes's Theorem. Stokes's theorem states that if we have any closed curve, and integrate the tangential component of a vector $\mathbf{F}$ around it, the result is equal to what we obtain if we take some surface bounded by the curve, and integrate the normal component of the curl of $\mathbf{F}$ over this surface:
\[ \oint F_s\,ds = \iint \operatorname{curl}_n\mathbf{F}\,dS. \tag{5} \]
To prove it, we first divide up the surface into small surface elements, of area $dS$.
For one of these the surface integral is $\operatorname{curl}_n\mathbf{F}\,dS$. Now suppose we choose the axes so that the $z$ axis is normal to $dS$, and the area $dS$ is bounded by $x$, $y$, $x + dx$, and $y + dy$ as in Fig. 41. Then the surface integral is $\left(\dfrac{\partial F_y}{\partial x} - \dfrac{\partial F_x}{\partial y}\right)dx\,dy$. Let us next compute $\oint F_s\,ds$ for the element of area. It is evidently
\[ F_x(x,y)\,dx + F_y(x+dx,y)\,dy - F_x(x,y+dy)\,dx - F_y(x,y)\,dy = \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)dx\,dy, \]
if we go around so as always to keep the surface on the left. Thus the theorem is true for such an infinitesimal surface.

Fig. 41. Circuit for proving Stokes's theorem.

But now, if we put the whole surface together out of its elements, the total surface integral will be the sum of the parts, or $\iint \operatorname{curl}_n\mathbf{F}\,dS$. Also the total line integral will be the sum of the integrals over the separate elements. To see this, we note that in making the sum, all boundaries except the outside edge of the area are shared by two elements of the area, and the line integral from one traverses the boundary in one direction, from the other in the opposite direction, so that the contributions all cancel, leaving only the integral over the outer boundary, which is then $\oint F_s\,ds$. Thus Stokes's theorem is proved.

145. The Curl in Curvilinear Coordinates. It is often useful to have the curl, and Stokes's theorem, in curvilinear coordinates. We refer back to Chap. XVIII, using methods analogous to those used there in discussing the divergence and gradient. Consider an approximately rectangular area, similar to that in Fig. 38,
The line integral about the circuit is Fi(q 1} q 2 )dsi + F 2 (qi + dq lf q 2 )ds 2 â€” Fi(q\, q 2 + dqi)dsi â€” Ftiqi, q 2 )ds 2 F 2 (qi + dq h q 2 ) ^sfai, jfOL- .j"Fi(gi , g 2 + dq 2 ) dq\ â€” â€” j tÂ«/2 râ€” /i 2 ai 2 I I hi Fi(q h qi) hx Since this must be curl 3 F dsids 2 , we have â€” â€¢' -**[4&) - Â£(Â£)} (6) with analogous relations for the two other components. We can illustrate the formulas by showing that the curl of the field of a straight wire is zero. Let us take cylindrical coordi- nates, in which r = q u 6 = q 2 , z = gs, /k = 1, h 2 = 1/r, h s = 1. The assumed magnetic field, along the tangent, is H r = 0, H e = 2i/r, H z = 0. We then have H d /h e = 2i, a constant, so that its derivative is zero, and the curl vanishes. 146. Applications of Stokes's Theorem. â€” Let us apply Stokes's theorem in a few cases. First, if the curl is everywhere zero, the line integral of the vector is zero around a closed path. It follows that the line integral from one point to another along any path is the same. This is the condition for the existence of a potential, and we now see that the vanishing of the curl is just the condition that we must have in order to set up a potential. But in the magnetic case, it is not true that the line integral around any path is zero. Any contour including the current has an integral different from zero. The whole situation is then explained if inside the wire carrying the current the curl of H is not zero, but is a vector pointing along the direction of the current, of such a magnitude that the total surface integral over the cross section of the wire is 4W. Thus, for example, a contour going once around the current has a surface integral of the curl equal to 4ri, which therefore must be the value of the line integral of the tangential component of H. To find the exact relation between the current and curl H, we imagine the current in the wire to be spread out through the actual material of the wire, as in fact it is. 
We set up u, the current density, or flux of electricity, satisfying the equation of continuity $\partial\rho/\partial t + \operatorname{div} u = 0$. Then $i = \iint u_n\,dS$, where the integration is over the cross section of the wire. We must have, then, $4\pi\iint u_n\,dS = \iint \operatorname{curl}_n H\,dS$, and since this must hold for any size wire, the natural assumption is that the same relation holds between the small elements of current, so that $4\pi u_n = \operatorname{curl}_n H$, or more generally

$$\operatorname{curl} H = 4\pi u. \tag{7}$$

Here u is in e.m.u. If it is in e.s.u., the equation is $\operatorname{curl} H = 4\pi u/c$. We can see one result of these equations. If the current, instead of being in a single wire, is distributed through space, the curl is different from zero everywhere, and there is no possibility of writing a potential at all.

147. Example: Magnetic Field in a Solenoid. Suppose we have an infinite solenoid, of finite radius, with n turns per centimeter, carrying current i, and that we wish to calculate the magnetic field inside it. We assume that it is in no external magnetic field, so that the field outside is zero. By symmetry, the field inside will point in the direction of the axis. Now let us apply Stokes's theorem to a path as follows: (1) Inside, along a line parallel to the axis, for 1 cm. The integral of H will be $H_i$, the H inside, times unit distance. (2) Straight out, radially, to the outside of the solenoid. Since H is at right angles, the integral of H will be zero. (3) Outside, back for 1 cm. along a line parallel to the axis. The integral is zero since H is zero outside. (4) Straight in again, closing the figure, and contributing nothing to the integral. Thus we have $\oint H_s\,ds = H_i$. Now $\iint \operatorname{curl}_n H\,dS = 4\pi\iint u_n\,dS = 4\pi \times$ total current flowing through the contour $= 4\pi n i$. Hence we have $H_i = 4\pi n i$, the formula for the magnetic field inside a solenoid, showing that it is constant, independent of position.

148. The Vector Potential. In magnetic fields coming from permanent magnets, where there is no current, we can write an ordinary potential, letting $H = -\operatorname{grad}\phi$. But this is only possible when curl H = 0, which is not true in the presence of currents. On the other hand, it can be shown that if the divergence of a vector is zero, as div H = 0, it is always possible to set up a vector A, called the vector potential (to distinguish it from $\phi$, which is called a scalar potential), such that H = curl A. This is often a useful thing to do. We can prove readily that div curl A = 0 always, so that we have div H = 0.

The vector potential satisfies a simple differential equation. We know that curl A = H, but this does not determine A uniquely. In fact, to determine a vector field uniquely we must specify both its curl and its divergence, and we can find a vector whose curl and divergence are any desired functions. Let us then demand that div A = 0. We now have $\operatorname{curl} H = 4\pi u/c = \operatorname{curl}\operatorname{curl} A$. It can be proved that $\operatorname{curl}\operatorname{curl} A = \operatorname{grad}\operatorname{div} A - \nabla^2 A = -\nabla^2 A$, since div A = 0. Hence

$$\nabla^2 A = -\frac{4\pi u}{c}, \tag{8}$$

similar to Poisson's equation for the scalar potential in terms of the charge density, $\nabla^2\phi = -4\pi\rho$. These two equations, expanded to include terms depending on time, prove to be very important in general electrical theory.

Let us set up the vector potential for a current in a straight wire. Take cylindrical coordinates, with the wire pointing along the z axis. Poisson's equation for A is a vector equation, but since u has only a z component, A will likewise have only a z component, which will depend only on r. Thus we have

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dA_z}{dr}\right) = -\frac{4\pi u}{c} \quad \text{for } r < R, \qquad = 0 \quad \text{for } r > R,$$

where R is the radius of the wire. The solutions of this equation are

$$A_z = -\frac{\pi u r^2}{c} + a\ln r + b \quad \text{for } r < R, \qquad A_z = d\ln r + e \quad \text{for } r > R.$$

Since A cannot become infinite at r = 0, we must have a = 0. We may choose b = 0.
Then d and e must be chosen to make A and its derivative with respect to r continuous at r = R. Noting that $\pi R^2 u = i$, the total current, this easily leads to

$$A_z = -\frac{2i}{c}\ln r + \text{constant} \quad \text{for } r > R.$$

The only component of H is then $H_\theta$, which is

$$H_\theta = -h_r h_z\frac{\partial}{\partial r}\left(\frac{A_z}{h_z}\right) = -\frac{dA_z}{dr} = \frac{2i}{cr}.$$

149. The Biot-Savart Law. In the case of a linear conductor carrying a current i, the expression for the vector potential, using the solution of Poisson's equation from Chap. XIX, becomes

$$A = \frac{i}{c}\int\frac{ds}{r},$$

where ds is the vector element of length taken along the conductor, and pointing in the direction of the current. To find the intensity of the magnetic field, we take the curl, finding

$$H = \operatorname{curl} A = \frac{i}{c}\int\operatorname{curl}\frac{ds}{r}.$$

In this equation, ds is a vector and r a scalar. In general, if S is a scalar and B an arbitrary vector, it is easy to show that

$$\operatorname{curl}(SB) = S\operatorname{curl} B + (\operatorname{grad} S)\times B.$$

Applying this relation to our case, B = ds, and S = 1/r, and we must remember that in taking the curl we differentiate only with respect to the coordinates which fix the point at which we wish the value of H (the field point). Now these coordinates appear only in r and not in ds, which depends on the circuit only. Thus the first term vanishes and we have

$$H = \frac{i}{c}\int\frac{ds\times\vec r}{r^3},$$

where $\vec r$ is the vector from ds to the field point, and r is the length of this vector. If we imagine that the resultant H is made up of a sum of contributions from each conductor element ds, we may write the law in its differential form

$$dH = \frac{i}{c}\,\frac{ds\times\vec r}{r^3}. \tag{10}$$

This is known as the Biot-Savart law. The magnitude of dH is obviously

$$|dH| = \frac{i\,ds}{c\,r^2}\sin\theta, \tag{11}$$

where $\theta$ is the angle between the direction of ds and $\vec r$; the direction of dH is perpendicular to the plane of ds and $\vec r$. Applied to closed circuits it always yields the same results as the integral law.
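As a numerical illustration of the differential law (10), the following Python sketch adds up the contributions dH from the elements of a long straight wire and compares the sum with the field 2i/(cr) found above from the vector potential. The function names, the finite wire length, the number of segments, and the choice i = c = 1 are all assumptions made for this illustration; they do not come from the text.

```python
import math

def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def biot_savart_field(field_point, path, current, c=1.0):
    """Sum dH = (i/c) (ds x r) / r^3 over the straight segments of a polyline."""
    H = [0.0, 0.0, 0.0]
    for p0, p1 in zip(path[:-1], path[1:]):
        ds = tuple(b - a for a, b in zip(p0, p1))            # element of length
        mid = tuple(0.5 * (a + b) for a, b in zip(p0, p1))   # position of the element
        r = tuple(f - m for f, m in zip(field_point, mid))   # from ds to the field point
        dist = math.sqrt(sum(x * x for x in r))
        ds_x_r = cross(ds, r)
        for k in range(3):
            H[k] += (current / c) * ds_x_r[k] / dist ** 3
    return tuple(H)

def straight_wire(half_length, n_segments):
    """Points along the z axis from -half_length to +half_length."""
    step = 2.0 * half_length / n_segments
    return [(0.0, 0.0, -half_length + j * step) for j in range(n_segments + 1)]

# Illustrative values (assumed): unit current, c = 1, field point at distance r0.
current, c, r0, half_length = 1.0, 1.0, 1.0, 200.0
H = biot_savart_field((r0, 0.0, 0.0), straight_wire(half_length, 20000), current, c)
H_theta = H[1]  # at (r0, 0, 0) the tangential direction is +y

H_infinite = 2.0 * current / (c * r0)                              # infinite wire
H_finite = H_infinite * half_length / math.hypot(half_length, r0)  # wire of finite length
print(H_theta, H_finite, H_infinite)
```

The finite-length value used for comparison follows from integrating (10) along a segment from -L to +L; as L grows large compared with r it tends to 2i/(cr).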
For open circuits this is not obvious, since we can add to the expressions for dH a differential $d\psi$ provided $\oint d\psi$ around a closed curve is zero. In this way we leave the law for closed circuits unaltered, but for open circuits change the value of H so calculated. Thus the integral law must be looked upon as the more fundamental.

Problems

1. Prove that a double layer of moment m per unit area leads to a potential $\phi$ at point P equal to $m\Omega$, where $\Omega$ is the solid angle subtended by the area from the point P.
2. Show that in the electrostatic system of units, charge has the dimensions $m^{1/2}l^{3/2}t^{-1}$, current the dimensions $m^{1/2}l^{3/2}t^{-2}$, voltage (e.m.f.) the dimensions $m^{1/2}l^{1/2}t^{-1}$, resistance the dimensions $l^{-1}t$, and capacity the dimensions $l$.
3. Derive the dimensions of charge, current, voltage, resistance, and capacity in the electromagnetic system of units.
4. Prove that if S is a scalar and B a vector, curl (SB) = S curl B + grad S × B.
5. Prove div curl F = 0; curl curl F = grad div F − $\nabla^2 F$, where F is any vector.
6. Using the Biot-Savart law, find the magnetic field at any point on the axis of symmetry of a circular loop of wire of radius R carrying a current i.
7. A current flows in a circular loop of wire, of radius R. Find the vector potential of the resulting magnetic field, at large distances compared with R, by adding the contributions to the vector potential due to the separate elements of current.
8. Compute the field, from the potential of the last problem, and show that it is approximately the field of a single dipole. Find the strength of the dipole, in terms of current and radius R.
9. Two parallel straight wires carry equal currents. Work out the magnetic fields due to the two together, in the two cases where the currents flow in the same or in opposite directions, drawing diagrams of the lines of force.
10. Find the magnetic field at points inside a wire carrying a current, assuming the wire is straight and of circular cross section and that the current has constant density throughout the wire.
11. Compute the curl in spherical polar coordinates. Verify directly that the divergence of a curl is zero in these coordinates.

CHAPTER XXI
ELECTROMAGNETIC INDUCTION AND MAXWELL'S EQUATIONS

We now leave the restriction of the steady state and inquire into the extensions of the theory necessary to have it hold for nonstationary phenomena. The fundamental fact concerning electromagnetic induction may be stated as follows: If a set of circuits carrying current (or magnets and circuits) are set in relative motion with respect to each other, the currents in the circuits change during the relative motion. Instead of formulating a law for the induced currents, it is simpler to consider the induced electromotive force. Take a closed circuit in the neighborhood of a moving magnet (or moving circuit), and let N be the number of magnetic lines of force through the circuit. Then the induced electromotive force is $-\frac{dN}{dt}$, expressed in electromagnetic units, if N is in these units. If the e.m.f. is expressed in electrostatic units it is equal to $-\frac{1}{c}\frac{dN}{dt}$. The minus sign expresses what is commonly termed Lenz's law and indicates that if $\frac{dN}{dt}$ is represented by a vector going through the circuit, the induced current flows in a clockwise fashion.

150. The Differential Equation for Electromagnetic Induction. We can now state this law in more analytical form. Consider the closed curve formed by the circuit, and any surface whose boundary is this curve, so that the surface forms a sort of cap over the curve. Then the magnetic flux is $N = \iint H_n\,dS$, where the integral is carried out over the whole surface. Furthermore the electromotive force is by definition the work done in carrying a unit charge once around the circuit.
This work may be done either by the electric field or by chemical forces in a battery. Since the latter are considered absent we have

$$\text{e.m.f.} = \oint E_s\,ds,$$

where the integral is taken completely around the circuit. The fact that this line integral does not vanish shows us at once that we shall not be able to introduce a potential, as we have done in the electrostatic case. Thus we have

$$\oint E_s\,ds = -\frac{d}{dt}\iint H_n\,dS. \tag{1}$$

It should be noticed that the flux of the magnetic field through the circuit may change in several ways, either by changing $H_n$, or by changing the shape of the circuit, thus causing a change in the enclosed area, or by moving the undeformed circuit to other parts of space where $H_n$ is different. In general dN/dt is composed of several terms. In the case of fixed circuits, we may replace the total time derivative by the partial derivative so that $dN/dt = \partial N/\partial t$. With the help of Stokes's theorem we rewrite the induction law as

$$\iint \operatorname{curl}_n E\,dS = -\iint\frac{\partial H_n}{\partial t}\,dS.$$

This holds for any fixed circuit, and hence for any fixed area of integration. Thus it must hold for an infinitesimal area dS, so that the integrands must be equal and we obtain

$$\operatorname{curl} E = -\frac{\partial H}{\partial t}.$$

This is the differential form of the induction law. In it, E and H are both expressed in e.m.u. If E is expressed in e.s.u. and H in e.m.u., the law takes the form

$$\operatorname{curl} E = -\frac{1}{c}\frac{\partial H}{\partial t}. \tag{2}$$

151. The Displacement Current. We have now derived four fundamental electromagnetic equations:

$$\operatorname{div} E = 4\pi\rho, \qquad \operatorname{div} H = 0, \qquad \operatorname{curl} E = -\frac{1}{c}\frac{\partial H}{\partial t}, \qquad \operatorname{curl} H = \frac{4\pi u}{c},$$

where E, $\rho$, and u are in e.s.u. and H in e.m.u. These are almost the Maxwell equations, but there is difficulty with the last of them. Of course, we have derived it on the basis of steady closed currents and for this case it is surely correct. The difficulty occurs when we try to apply this result to nonstationary cases. In the nonsteady state we have the new possibility of current flowing in "open" circuits.
The simplest example is that of the discharge of a condenser. Here the current starts at the positively charged plate, whose charge diminishes as the current flows to the negatively charged plate and annuls the charge there. Thus we can look upon the condenser plate as a source (or sink) of current. Now if we take the divergence of the last equation, we have

$$\operatorname{div}\operatorname{curl} H = \frac{4\pi}{c}\operatorname{div} u,$$

and since the divergence of any curl is zero, we find that div u equals zero, which means that the current is always closed and there are no sources or sinks. Thus open circuits lead to a contradiction to this equation. We have derived the equation from steady-state considerations, however, and if we are to extend it to hold under all conditions, it is clear that there must be some term which vanishes for the steady state which we must add. The equation of continuity applied to electric charge and current tells us that

$$\operatorname{div} u + \frac{\partial\rho}{\partial t} = 0,$$

expressing the fact that the flow of current out of a volume results in a decrease of charge in that volume. In the steady state $\partial\rho/\partial t = 0$, so that div u = 0, and we have no inconsistency with our fundamental equation. It is certainly clear that if curl H is to be proportional to a current, this current must be divergenceless, and u is not. Maxwell made the bold step of assuming that the whole current consisted of two terms u and u′, where u′ was so chosen that div (u + u′) = 0. In this way the distinction between open and closed circuits vanishes and a unity hitherto lacking was given to the laws. Maxwell saw at once that we must set $u' = \frac{1}{4\pi}\frac{\partial E}{\partial t}$. For then we have

$$\operatorname{div}(u + u') = \operatorname{div}\left(u + \frac{1}{4\pi}\frac{\partial E}{\partial t}\right) = \operatorname{div} u + \frac{1}{4\pi}\frac{\partial}{\partial t}(\operatorname{div} E) = \operatorname{div} u + \frac{\partial\rho}{\partial t} = 0,$$

and this is the equation of continuity which we have been trying to satisfy. In other words, Maxwell assumed the correct equation to be

$$\operatorname{curl} H = \frac{1}{c}\frac{\partial E}{\partial t} + \frac{4\pi u}{c}. \tag{4}$$

The new term $\frac{1}{4\pi}\frac{\partial E}{\partial t}$ is called the displacement current, in contrast to the convection current u.

Actually the real advance of Maxwell over his predecessors lies in the introduction of this displacement current. The physical meaning of this current can be obtained by considering the charging of a condenser. Current flows from one plate through the wire to the other plate. If the current is i, this equals the rate of increase of charge on the plate. Suppose the plates are of area A, separation d, then the field between them is

$$E = 4\pi\sigma = \frac{4\pi}{A}\times\text{total charge},$$

and the displacement current density in the region between the plates is

$$\frac{1}{4\pi}\frac{\partial E}{\partial t} = \frac{\partial\sigma}{\partial t} = \frac{1}{A}\frac{d}{dt}(\text{total charge}) = \frac{i}{A}.$$

Thus the displacement current is $\frac{A}{4\pi}\frac{\partial E}{\partial t} = i$, and is equal to the convection current in the wire, so that the current becomes continuous throughout the circuit. The fundamental assumption of Maxwell was that the displacement current is always present when an electric field varies in time and produces the same magnetic effects as convection currents.

It is clear that a test of Maxwell's hypothesis can only be made with very rapidly varying fields, since we must make $\frac{1}{4\pi}\frac{\partial E}{\partial t} \gg u$ in order to keep the convection current effects from masking the displacement current effects. As is well known, Hertz, in 1888, performed the experiments on electric waves which confirmed this assumption of Maxwell. There is an interesting connection between the displacement current and the Biot-Savart law. All the attempts before Maxwell were to find a correct form of the Biot-Savart law for "open" circuits. As we pointed out in the last chapter, the addition of a total differential to this law would yield nothing when it was applied to closed circuits, and the hope was that the correct form to be added to this law could be found so as to account for open circuit phenomena.

152. Maxwell's Equations. We can now write the correct Maxwell equations

$$\operatorname{curl} H - \frac{1}{c}\frac{\partial E}{\partial t} = \frac{4\pi u}{c}, \qquad \operatorname{curl} E + \frac{1}{c}\frac{\partial H}{\partial t} = 0,$$
$$\operatorname{div} H = 0, \qquad \operatorname{div} E = 4\pi\rho.$$

These are the fundamental equations of electromagnetic theory. They need extension in but one way. If there are dielectric and magnetic bodies present, in them Coulomb's law and its analogue for the magnetic field become

$$F = \frac{ee'}{\epsilon r^2} \quad \text{and} \quad F = \frac{mm'}{\mu r^2},$$

where $\epsilon$ is the dielectric constant and $\mu$ the magnetic permeability. We now introduce a new vector called the electric displacement D, defined by $D = \epsilon E$, where E is the intensity of the electric field. Similarly, we introduce the magnetic induction vector $B = \mu H$. It is easy to see from our previous work that we now have the relation $\operatorname{div} D = 4\pi\rho$. Furthermore, Faraday's induction law refers to the rate of change of magnetic flux through a circuit and hence H must be replaced by B in this relation. Finally, we have $\operatorname{div}\operatorname{curl} E = 0 = -\frac{1}{c}\frac{\partial}{\partial t}\operatorname{div} B$, so that div B = 0, rather than div H = 0. The final equations are thus found to be:

$$\operatorname{curl} H - \frac{1}{c}\frac{\partial D}{\partial t} = \frac{4\pi u}{c}, \qquad \operatorname{curl} E + \frac{1}{c}\frac{\partial B}{\partial t} = 0,$$
$$\operatorname{div} B = 0, \qquad \operatorname{div} D = 4\pi\rho,$$
$$B = \mu H, \qquad D = \epsilon E. \tag{5}$$

In these equations, E, D, $\rho$, u are in electrostatic units, H and B in electromagnetic units. In Chap. XXIV we discuss in detail the significance of B and D, and the interpretation of $\epsilon$ and $\mu$.

Maxwell's equations suffice to determine the field, when we are given the charges and currents. To make a complete set of dynamical principles, however, we need two more relations. First is the formula giving the force acting on a charge and current. The electrical force per unit volume is simply $\rho E$, the force on unit charge multiplied by the charge per unit volume. The magnetic force is that acting on the current, as observed in the ordinary action of the electric motor.
This force acts at right angles both to the current and to the magnetic field, and is proportional, as is shown in the elementary study of electricity, to the current (in electromagnetic units) times the component of magnetic field at right angles to the current. For unit volume, this is just given by the vector product u × H. If u is in electrostatic units, it is $\frac{u}{c}\times H$. Thus we have for the force vector

$$F = \rho E + \frac{1}{c}(u\times H).$$

If the current density is produced by the motion of charge, we have $u = \rho v$, where v is the velocity vector of the charge. In this case

$$F = \rho\left[E + \frac{1}{c}(v\times H)\right].$$

This relation has been particularly used by Lorentz. Finally, one must have a law, such as Newton's law stating that the force is equal to mass times acceleration, determining the motion of charge in terms of the force acting. With such a law, we find the field from the charge, the force from the field, and the motion from the force, obtaining therefore a complete system of dynamics.

Let us now summarize the various steps gone through in building up Maxwell's equations. Consider first the static case. Here $\partial D/\partial t = \partial B/\partial t = 0$ and u = 0. The equations become

$$\begin{aligned} \operatorname{curl} H &= 0 &\qquad \operatorname{curl} E &= 0\\ \operatorname{div} B &= 0 & \operatorname{div} D &= 4\pi\rho\\ B &= \mu H & D &= \epsilon E. \end{aligned}$$

The three equations on the left are those of magnetostatics, and the remaining three are those of electrostatics. Each system is completely independent of the other. The equations curl H = 0, and curl E = 0, show that scalar potentials exist.

In the stationary case, we still have $\partial B/\partial t = \partial D/\partial t = 0$, but now $u \neq 0$. The only one of the equations above which is modified is $\operatorname{curl} H = 4\pi u/c$, the others remaining unchanged. It is usual to include Ohm's law in the statement of the equations, however. This law is easily stated in differential form by considering a small volume, having length L in the direction of the current flow, and cross-sectional area A normal to the current. We apply Ohm's law in the form p.d. = iR.
Here the potential difference is the field E times the length L of the volume, the current is the area times the current density u, and the resistance is the specific resistance times L/A. Hence we have $EL = Au\,\frac{L}{A}\times$ specific resistance, or

$$u = \sigma E, \tag{6}$$

where $\sigma$, the specific conductivity, is the reciprocal of the specific resistance. This equation is Ohm's law in the form suitable for Maxwell's equations, and it is commonly included along with $D = \epsilon E$ and $B = \mu H$.

If we now proceed to the nonstationary state we must strictly use the correct Maxwell relations. But there is a case of utmost practical importance, in which $\partial D/\partial t \ll 4\pi u$, and hence for which the effects of displacement can be neglected in comparison with those of the convection currents. The Maxwell equations with the displacement current omitted apply to the so-called "quasi-stationary" processes, and these form practically the whole domain of electrical engineering. The magnetic field inside and outside conductors is calculated as if produced only by the convection currents, but the induction law is not left out as in the stationary state. Here we have a double coupling of electric and magnetic fields, first, as in the stationary case, where electric currents produce magnetic fields, and, secondly, by the induction law. Since the essentially new contribution of Maxwell, the displacement current, is neglected in quasi-stationary calculations, it is clear that no study in that field can give experimental confirmation of Maxwell's idea.

153. The Vector and Scalar Potentials. We observe that, if H depends on time, $\operatorname{curl} E \neq 0$, so that there is no potential for E. The ordinary electrical potential is thus confined to static problems. Further, if u or $\partial E/\partial t \neq 0$, there is no potential for H. We have seen in the last chapter how a potential can be introduced for H: one uses a vector potential A, possible because div H = 0. That is, we let

$$H = \operatorname{curl} A. \tag{7}$$

We can do this even in the general case. And it proves that we can use a scalar potential $\phi$, reducing to the electrostatic potential in the case of a steady state, but different in other cases, by a special device. The relation which proves to be satisfied is that

$$E = -\operatorname{grad}\phi - \frac{1}{c}\frac{\partial A}{\partial t}, \tag{8}$$

reducing to the familiar $E = -\operatorname{grad}\phi$ when everything is independent of time. These relations are written for the case of empty space, where $\epsilon = \mu = 1$, and we shall give the discussion only for that case.

To verify our statements about the vector potential A and the scalar potential $\phi$ we substitute the expressions for E and H in Maxwell's equations, and see if they can be satisfied by the proper choice of A and $\phi$. First, we notice that div H = div curl A = 0, so that this equation is automatically satisfied. Next we take

$$\operatorname{div} E = -\operatorname{div}\operatorname{grad}\phi - \frac{1}{c}\frac{\partial}{\partial t}\operatorname{div} A = -\nabla^2\phi - \frac{1}{c}\frac{\partial}{\partial t}\operatorname{div} A.$$

This must equal $4\pi\rho$. Now we consider the curl equations. We have $\operatorname{curl} E = -\operatorname{curl}\operatorname{grad}\phi - \frac{1}{c}\frac{\partial}{\partial t}\operatorname{curl} A$. Since the curl of any gradient is zero, this is $-\frac{1}{c}\frac{\partial}{\partial t}\operatorname{curl} A = -\frac{1}{c}\frac{\partial H}{\partial t}$, verifying another of Maxwell's equations. Finally $\operatorname{curl} H = \operatorname{curl}\operatorname{curl} A = \operatorname{grad}\operatorname{div} A - \nabla^2 A$. This must equal

$$\frac{1}{c}\frac{\partial E}{\partial t} + \frac{4\pi u}{c} = -\frac{1}{c}\frac{\partial}{\partial t}\operatorname{grad}\phi - \frac{1}{c^2}\frac{\partial^2 A}{\partial t^2} + \frac{4\pi u}{c}.$$

Hence, in order to satisfy Maxwell's equations, we must have

$$-\nabla^2\phi - \frac{1}{c}\frac{\partial}{\partial t}\operatorname{div} A = 4\pi\rho,$$
$$\operatorname{grad}\operatorname{div} A - \nabla^2 A + \frac{1}{c}\frac{\partial}{\partial t}\operatorname{grad}\phi + \frac{1}{c^2}\frac{\partial^2 A}{\partial t^2} = \frac{4\pi u}{c}.$$

But now let us choose A and $\phi$ subject to the condition that $\operatorname{div} A + \frac{1}{c}\frac{\partial\phi}{\partial t} = 0$. Since div A is so far arbitrary, we can do this. Then the first equation becomes

$$\nabla^2\phi - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = -4\pi\rho,$$

and the second

$$\nabla^2 A - \frac{1}{c^2}\frac{\partial^2 A}{\partial t^2} = -\frac{4\pi u}{c}. \tag{9}$$

These are the equations for the potentials. If A and $\phi$ satisfy them, then, as we stated before, the fields determined from them by the equations $E = -\operatorname{grad}\phi - \frac{1}{c}\frac{\partial A}{\partial t}$, H = curl A, satisfy Maxwell's equations.
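As a quick check of these relations, one may try a simple pair of potentials in empty space, where the charge and current densities vanish. The plane-wave form below, with arbitrary constants a, b, and k, is an illustrative choice and not an example taken from the text. Both potentials depend on x and t only through x − ct, so each satisfies the homogeneous equation for the potentials, and the condition connecting div A with the time derivative of the scalar potential holds as well.

```latex
% Illustrative plane-wave potentials in empty space (rho = 0, u = 0).
% The constants a, b, k are arbitrary and are assumptions of this sketch.
\begin{aligned}
\phi &= a\sin k(x-ct), \qquad
A = \bigl(a\sin k(x-ct),\; b\sin k(x-ct),\; 0\bigr),\\
\operatorname{div}A + \frac{1}{c}\frac{\partial\phi}{\partial t}
  &= ak\cos k(x-ct) - ak\cos k(x-ct) = 0,\\
E &= -\operatorname{grad}\phi - \frac{1}{c}\frac{\partial A}{\partial t}
   = \bigl(0,\; bk\cos k(x-ct),\; 0\bigr),\\
H &= \operatorname{curl}A = \bigl(0,\; 0,\; bk\cos k(x-ct)\bigr),\\
\operatorname{curl}E &= \bigl(0,\; 0,\; -bk^{2}\sin k(x-ct)\bigr)
   = -\frac{1}{c}\frac{\partial H}{\partial t},\\
\operatorname{curl}H &= \bigl(0,\; bk^{2}\sin k(x-ct),\; 0\bigr)
   = \frac{1}{c}\frac{\partial E}{\partial t},\\
\operatorname{div}E &= 0, \qquad \operatorname{div}H = 0.
\end{aligned}
```

The part of the potentials proportional to a drops out of both E and H, so in this sketch only the transverse part of A, proportional to b, contributes to the fields.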
The equations for the potentials are of the form called D'Alembert's equation, and as can be seen are extensions of Poisson's equation, obtained by adding the time derivatives. We observe that in regions where there is no charge and current density, the potential satisfies the wave equation, which is the homogeneous equation obtained by setting the right side of D'Alembert's equation equal to zero. That is, $\phi$ and A are given by functions representing waves traveling with velocity c. Hence the same thing must be true of the fields E and H. This is the origin of the theory of electromagnetic waves, and of the electromagnetic theory of light, and the proof that c, the ratio of the units, is at the same time the velocity of light.

In regard to our condition imposed on the potentials, that $\operatorname{div} A + \frac{1}{c}\frac{\partial\phi}{\partial t} = 0$, we can readily show that if the potentials satisfy Eqs. (9) above, this condition can also be satisfied. For take 1/c times the time derivative of the first, and the divergence of the second, and add. Using the fact that $\operatorname{div}\nabla^2 A = \nabla^2\operatorname{div} A$, where A is any vector, the result is

$$\nabla^2\left(\operatorname{div} A + \frac{1}{c}\frac{\partial\phi}{\partial t}\right) - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\left(\operatorname{div} A + \frac{1}{c}\frac{\partial\phi}{\partial t}\right) = -\frac{4\pi}{c}\left(\operatorname{div} u + \frac{\partial\rho}{\partial t}\right) = 0.$$

That is, the quantity $\operatorname{div} A + \frac{1}{c}\frac{\partial\phi}{\partial t}$ satisfies the wave equation everywhere. It can be proved that no function, other than zero, can satisfy the wave equation everywhere, unless its value at infinity is different from zero. Hence in an ordinary problem of charges at finite points, where certainly the potentials must vanish at infinity, it must be that $\operatorname{div} A + \frac{1}{c}\frac{\partial\phi}{\partial t} = 0$, and in other cases we can certainly choose the potentials so that this condition will be satisfied.

Problems

1. Show that E and H satisfy the wave equations $\nabla^2 E - \frac{1}{c^2}\frac{\partial^2 E}{\partial t^2} = 0$, with a similar equation for H, in empty space, where u and $\rho$ are zero, and $\epsilon = \mu = 1$. (Suggestion: for the first, take the equation for curl E, and take its curl, then substitute for curl H in terms of E. Proceed in an analogous way with the other equation.)
2. In a region where u and $\rho$ are zero, but $\epsilon$ and $\mu$ are different from 1, show that the velocity of light is $c/\sqrt{\epsilon\mu}$.
3. A magnetic field points along the z axis, and its magnitude is proportional to the time, and independent of position. Find the vector potential. Assuming that the scalar potential is zero, find the induced electric field. Prove by direct integration, using a circular circuit, that the law of induction holds.
4. Describe the magnetic field between the plates of a condenser while it is charging up.
5. Starting from the induction law, show that the line integral of $\left(E + \frac{1}{c}\frac{\partial A}{\partial t}\right)$ around a closed path is zero, where A is the vector potential. From this show that the curl of the above vector vanishes and hence that $E = -\operatorname{grad}\phi - \frac{1}{c}\frac{\partial A}{\partial t}$, where $\phi$ is the scalar potential.
6. In conductors where $\mu = 1$ and $\rho = 0$ show that E and H both satisfy differential equations of the form $\nabla^2 E - \frac{4\pi\sigma}{c^2}\frac{\partial E}{\partial t} - \frac{\epsilon}{c^2}\frac{\partial^2 E}{\partial t^2} = 0$.
7. Derive the differential equations satisfied by E and H for quasi-stationary processes.
8. Show that if a voltage is induced in a circuit (2) by a changing magnetic field due to a circuit (1), the induced e.m.f. in (2) is given by
$$\oint E\cdot ds_2 = -\frac{1}{c}\frac{d}{dt}\oint A_1\cdot ds_2,$$
where $A_1$ is the vector potential at the element $ds_2$ due to the current in circuit (1). For quasi-stationary processes we can write
$$A_1 = \frac{1}{c}\iiint\frac{u_1}{r}\,dv_1,$$
where $u_1$ is the current density in circuit (1) and $dv_1$ a volume element thereof. For linear currents show that the induced e.m.f. is then given by
$$\oint E\cdot ds_2 = -\frac{1}{c^2}\frac{d}{dt}\left(I_1\oint\oint\frac{ds_1\cdot ds_2}{r_{12}}\right),$$
where $I_1$ is the current in the first circuit, and $r_{12}$ is the distance between $ds_1$ and $ds_2$. The coefficient of mutual induction $M_{12}$ is defined as
$$M_{12} = \frac{1}{c^2}\oint\oint\frac{ds_1\cdot ds_2}{r_{12}},$$
so that the above relation becomes
$$(\text{E.m.f.})_2 = -\frac{d}{dt}(M_{12}I_1).$$

CHAPTER XXII
ENERGY IN THE ELECTROMAGNETIC FIELD

The idea of energy is as useful in electromagnetic theory as in mechanics.
Maxwell's equations correspond in a general way to the equations of motion, and in the present chapter we introduce electrical and magnetic energies analogous to the potential and kinetic energies. The analogy is particularly close with the mechanical energy in a vibrating medium, since electrical oscillations in free space, as in a light wave, are similar to mechanical oscillations in sound. The energy of an elastic solid is distributed throughout the body, each volume element having a potential energy on account of its strain, and a kinetic energy on account of its velocity. Correspondingly we shall find that the electromagnetic energy can be considered as localized throughout the field, with a definite density of electrical and magnetic energy. Finally, the potential energy is proportional to the square of the stress or strain, and kinetic energy proportional to the square of velocity or momentum, and in a similar way here we shall find electrical energy proportional to the square of E or D, and the magnetic energy to the square of H or B. The analogy can be carried out completely, Maxwell's equations, for instance, being written in the form of Lagrangian equations; however, we shall not do this. We start the discussion by deriving the electrical and magnetic energy by elementary means from the condenser and solenoid, and then pass to general theorems involving energy density and energy flow.

154. Energy in a Condenser. Given a condenser of capacity C, let its charge at a given moment be q. Assume that we are charging up the condenser, and that we wish to know how much work we shall have to do on it to charge it. To take a small additional charge dq around the circuit, against the difference of potential q/C, will require an amount of work (q/C)dq. Thus the whole work done in setting up a charge Q is

$$\int_0^Q\frac{q}{C}\,dq = \frac{1}{2}\frac{Q^2}{C}.$$

This is the expression for the energy in a condenser which we found in Chap. V, Prob. 6.
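The charging argument can be checked numerically. The short Python sketch below adds up the increments of work (q/C)dq as the charge is built up in small steps, and compares the total with the closed form Q²/2C that results from integrating (q/C)dq from 0 to Q. The function name, the step count, and the values of Q and C are illustrative assumptions, not quantities from the text.

```python
def charging_work(total_charge, capacity, n_steps=100000):
    """Sum the work (q/C) dq done in bringing up the charge in n_steps equal parts."""
    dq = total_charge / n_steps
    work = 0.0
    for j in range(n_steps):
        q_mid = (j + 0.5) * dq            # charge on the condenser midway through this step
        work += (q_mid / capacity) * dq   # work against the potential difference q/C
    return work

# Illustrative values (assumed), in any consistent system of units.
Q, C = 3.0, 2.0
numerical = charging_work(Q, C)
closed_form = Q ** 2 / (2.0 * C)
print(numerical, closed_form)
```

Because the potential difference grows linearly with q, evaluating it at the middle of each step makes the sum agree with the closed form apart from rounding error.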
But now there is an interesting way in which we may consider this. We may imagine that the energy resides directly in the electromagnetic field, between the condenser plates. Let the area of the plates be A, the distance of separation d, and the dielectric constant $\epsilon$, so that $C = A\epsilon/4\pi d$. Also the field between the plates will be E = q/Cd, the difference of potential between the plates divided by the distance. Hence we have $\frac{1}{2}q^2/C = \frac{1}{2}E^2Cd^2 = (\epsilon E^2/8\pi)(Ad)$. But Ad is simply the volume of the condenser, or of the region of space where the field is E. Hence we may consider the energy to be located in the electromagnetic field, with a volume density $\epsilon E^2/8\pi$, and the integral of this over the condenser will give precisely the total energy.

155. Energy in the Electric Field. It is not difficult to show that in an arbitrary electrostatic field the energy is given by

$$V = \frac{\epsilon}{8\pi}\iiint E^2\,dv.$$

Let us consider two point charges $e_1$ and $e_2$ in a medium of dielectric constant $\epsilon$ separated by a distance $r_{12}$. The force acting on each is given by Coulomb's law as

$$F = \frac{e_1e_2}{\epsilon r_{12}^2},$$

and the potential energy of the system by

$$V = \frac{e_1e_2}{\epsilon r_{12}} = \frac{1}{2}\left(e_1\frac{e_2}{\epsilon r_{12}} + e_2\frac{e_1}{\epsilon r_{12}}\right).$$

We have written this in two terms and notice that the first is just the charge $e_1$ times the potential at the point where the charge is, due to the charge $e_2$. Similarly the second term is $e_2$ times the potential at $e_2$ due to $e_1$. Thus we can write $V = \frac{1}{2}(e_1\phi_1 + e_2\phi_2)$, where $\phi_1$ and $\phi_2$ are the potentials. In general for n charges we have

$$V = \frac{1}{2}\sum_k e_k\phi_k, \tag{2}$$

and if the charges are distributed in space instead of being point charges, this becomes an integral

$$V = \frac{1}{2}\iiint\rho\phi\,dv, \tag{3}$$

where $\rho$ is the density of charge.
Now, by Poisson's equation we know that $\nabla^2\phi = -4\pi\rho/\epsilon$, so that the integral can be written

$$V = \frac{1}{2}\iiint\rho\phi\,dv = -\frac{\epsilon}{8\pi}\iiint\phi\nabla^2\phi\,dv.$$

We now make use of Green's theorem in its first form

$$\iiint\psi\nabla^2\phi\,dv + \iiint\operatorname{grad}\psi\cdot\operatorname{grad}\phi\,dv = \iint\psi\operatorname{grad}_n\phi\,dS,$$

where $\phi$ and $\psi$ are any two scalar quantities. Place $\psi = \phi$, the potential, and this becomes

$$\iiint\phi\nabla^2\phi\,dv + \iiint E^2\,dv = \iint\phi\operatorname{grad}_n\phi\,dS,$$

since $E = -\operatorname{grad}\phi$. Now since we integrate over all space, we must examine the behavior of the surface integral as the surface (a sphere of radius R, for example) gets larger and larger. The potential $\phi$ varies as 1/R for large R, $\operatorname{grad}_n\phi$ as $1/R^2$, and dS is proportional to $R^2$, so the whole surface integral vanishes as $R \to \infty$. Thus substituting in our expression for V, we find

$$V = \frac{\epsilon}{8\pi}\iiint E^2\,dv, \tag{4}$$

which is the equation we set out to derive. From our derivation it is easy to show that if $\epsilon$ is not constant

$$V = \frac{1}{8\pi}\iiint E\cdot D\,dv,$$

where D is the electric displacement vector. This shows us the origin of the name for D. If we think of D as an ordinary displacement (per unit volume) of electricity, then the work done per unit volume is the scalar product of the force times the displacement. In an infinitesimal displacement dD, the work per unit volume is proportional to $E\cdot dD = \epsilon E\cdot dE$, and for a finite displacement D we get something proportional to

$$\int E\cdot dD = \int\epsilon E\cdot dE = \frac{\epsilon E^2}{2} = \frac{E\cdot D}{2}.$$

Thus, except for the numerical factor $1/4\pi$, we have the potential energy per unit volume.

156. Energy in a Solenoid. In a similar way, we may consider the magnetic energy in a solenoid to reside in the magnetic field within the coil. We have found earlier that the energy in a solenoid of self-induction L, in which a current i was flowing, was $\frac{1}{2}Li^2$. But now we can easily write this in terms of the field H within the solenoid. We have seen that this field is $4\pi ni$, where n is the number of turns of the coil per centimeter.
The coefficient of self-induction $L$ for a coil is easily found. By definition, it is the e.m.f. induced when there is unit time rate of change of current through the coil. The e.m.f. per turn is $\frac{d}{dt}(B \times \text{cross-sectional area}) = \pi r^2\mu\,\frac{dH}{dt}$, if $r$ is the radius of the coil, $\mu$ the permeability. Thus the e.m.f. for the whole $N$ turns is $N\pi r^2\mu\,\frac{dH}{dt}$. Since $H = 4\pi ni = 4\pi Ni/d$, if $N$ is the whole number of turns, $d$ the length, the e.m.f. is $N\pi r^2\mu\,\frac{4\pi N}{d}\frac{di}{dt}$, so that $L = 4\pi^2N^2r^2\mu/d$. Hence we have

$$\tfrac{1}{2}Li^2 = \frac{2\pi^2N^2r^2\mu}{d}\left(\frac{Hd}{4\pi N}\right)^2 = \frac{\mu H^2r^2d}{8} = \frac{\mu H^2(\pi r^2d)}{8\pi}.$$

Since $\pi r^2d$ is the volume, this indicates a volume density of magnetic energy of $\mu H^2/8\pi$. The proof that the total magnetic energy in a magnetostatic field is

$$\frac{\mu}{8\pi}\iiint H^2\,dv \quad\text{or}\quad \frac{1}{8\pi}\iiint \mathbf{H}\cdot\mathbf{B}\,dv$$

is carried out in exactly the same manner as the one for the electrostatic energy given in the last paragraph.

157. Energy Density and Energy Flow. The examples we have considered suggest that in a combined electric and magnetic field there should be a volume density $(1/8\pi)(\epsilon E^2 + \mu H^2)$ of electromagnetic energy. As a matter of fact, it proves to be quite possible to make this assumption, and to carry it out in a logical way. One can regard the electromagnetic energy almost as a fluid, having a certain density, flowing from place to place in the field. Thus, there is a flow vector associated with it, called Poynting's vector, which we shall show in the next section to be equal to $(c/4\pi)(\mathbf{E}\times\mathbf{H})$. We shall prove that there is an equation of continuity for the energy:

$$\operatorname{div}\left[\frac{c}{4\pi}(\mathbf{E}\times\mathbf{H})\right] + \frac{\partial}{\partial t}\left[\frac{1}{8\pi}(\epsilon E^2 + \mu H^2)\right] = 0. \qquad (5)$$

This is only true, however, in regions where electromagnetic energy is not being produced. Of course, energy as a whole is conserved, but there can easily be sources and sinks of electromagnetic energy.
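Thus batteries are sources, in which chemical energy is converted into electrical energy, and resistances are sinks, in which the electrical energy is converted into heat. We imagine the field as being worked on by the battery, and as doing work against the frictional resistance. Hence our whole relation is that $\partial/\partial t$ (electromagnetic energy) = rate of production of energy from e.m.f. per unit volume $-$ rate of dissipation of energy into heat $-$ div (energy flow). This equation, put in mathematical form, is Poynting's theorem.

158. Poynting's Theorem. Let us compute the quantity

$$\operatorname{div}\left[\frac{c}{4\pi}(\mathbf{E}\times\mathbf{H})\right] + \frac{\partial}{\partial t}\left[\frac{1}{8\pi}(\epsilon E^2 + \mu H^2)\right].$$

It can be shown in general that $\operatorname{div}(\mathbf{A}\times\mathbf{B}) = \mathbf{B}\cdot\operatorname{curl}\mathbf{A} - \mathbf{A}\cdot\operatorname{curl}\mathbf{B}$. Also $\partial A^2/\partial t = 2\mathbf{A}\cdot\partial\mathbf{A}/\partial t$. Hence the expression is equal to

$$\frac{c}{4\pi}\left(\mathbf{H}\cdot\operatorname{curl}\mathbf{E} - \mathbf{E}\cdot\operatorname{curl}\mathbf{H} + \frac{\epsilon}{c}\,\mathbf{E}\cdot\frac{\partial\mathbf{E}}{\partial t} + \frac{\mu}{c}\,\mathbf{H}\cdot\frac{\partial\mathbf{H}}{\partial t}\right) = \frac{c}{4\pi}\left[\mathbf{H}\cdot\left(\operatorname{curl}\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}\right) - \mathbf{E}\cdot\left(\operatorname{curl}\mathbf{H} - \frac{1}{c}\frac{\partial\mathbf{D}}{\partial t}\right)\right].$$

But by Maxwell's equations $\operatorname{curl}\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{B}}{\partial t} = 0$ and $\operatorname{curl}\mathbf{H} - \frac{1}{c}\frac{\partial\mathbf{D}}{\partial t} = \frac{4\pi\mathbf{u}}{c}$, so that the result is $-\mathbf{E}\cdot\mathbf{u}$. Hence Poynting's theorem is

$$\operatorname{div}\left[\frac{c}{4\pi}(\mathbf{E}\times\mathbf{H})\right] + \frac{\partial}{\partial t}\left[\frac{1}{8\pi}(\epsilon E^2 + \mu H^2)\right] = -\mathbf{E}\cdot\mathbf{u}. \qquad (6)$$

From the analysis of the last section, we see that $-\mathbf{E}\cdot\mathbf{u}$ must represent the total rate of production of electromagnetic energy by e.m.f.s minus the rate of dissipation into heat. The latter is simple: in regions where Ohm's law holds, $\mathbf{u} = \sigma\mathbf{E}$, so that here we have the contribution $-\sigma E^2$ to the right side. The quantity $\sigma E^2$ represents the ordinary dissipation of energy into heat. We must examine the other sort of term, the external e.m.f., rather more carefully.

159. The Nature of an E.M.F. In a conductor carrying a current, there will be a current $\mathbf{u}$ set up, equal to the total force per unit charge, times $\sigma$. The force is ordinarily simply the electrical force $\mathbf{E}$. But sometimes there are other sorts of force acting.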
For example, in a battery, the various concentrations of electrolytes produce a definite pressure on the ions, forcing them mechanically in one direction, and this force would not ordinarily be considered as being electrical in nature. Inside a battery, the electric field is actually opposite to the flow of current, pointing from positive pole to negative, while the current flows from negative to positive. But the additional force acting on the charges counteracts the electric field, and does enough more so that it can push the current through the internal resistance of the battery. This latter part is already taken care of in computing the work done by the resistance. The former part, just equal and opposite to the $\mathbf{E}$ in the battery, is the force responsible for the applied e.m.f. of the battery. Thus it is $-\mathbf{E}$ per unit charge. And the rate of working of the force on unit charge is the force times the velocity of the charge. We actually wish the rate of working per unit volume, so that we must multiply by the charge per unit volume. This is $\rho$, and its product with the velocity is just the current density $\mathbf{u}$, so that we have $-\mathbf{E}\cdot\mathbf{u}$ as the rate of working of the e.m.f. on the electrical system. This is just the contribution to the right side of Poynting's theorem which we should get inside the batteries.

160. Examples of Poynting's Vector. The conception of the energy of the electromagnetic field as residing in the medium is a very fundamental one, which has had great influence in the development of the theory. Thus Maxwell thought of the medium as resembling an elastic solid, the electrical energy representing the potential energy of strain of the medium, the magnetic energy the kinetic energy of motion. Such a definite view is no longer held. Nevertheless, the energy is always believed to travel through space. Thus, in a light wave, there is a certain energy per unit volume, proportional to the square of the amplitude ($E$ or $H$).
This energy travels along, and Poynting's vector is the vector which measures the rate of flow, or the intensity of the wave. We shall show that the vector actually points along the ray of light, the direction of flow. If, for example, we have a source of light, and we wish to find at what rate it is emitting energy, we surround it by a closed surface, and integrate the normal component of Poynting's vector over the surface. The whole conception of energy being transported in the medium is evidently quite fundamental to the electromagnetic theory of light.

When we come back to charges and currents, however, it is a little harder to see the significance of the energy in the medium. For example, in a circuit consisting of a battery, and a wire connecting the plates, Poynting's vector indicates that the energy flows out of the battery, through the space surrounding the wire, and finally flows into the wire at the point where it will be transformed into heat. This seems to have small physical significance.

In a moving electron, the situation is somewhat more reasonable. Suppose that the electron at rest is to be represented by a sphere of radius $R$, on the surface of which the charge is distributed. Then the field will be $e/r^2$ at any point outside the sphere. The total electrical energy is the volume integral of $\frac{1}{8\pi}\frac{e^2}{r^4}$ over all space outside the sphere, or

$$\frac{1}{8\pi}\int_R^\infty \frac{e^2}{r^4}\,4\pi r^2\,dr = \frac{e^2}{2R}.$$

In the theory of the electron, it is this quantity which is interpreted as being the actual constitutive energy of the electron; although a correction must be made of an additional energy required to keep the sphere in equilibrium. Neglecting this correction, we can compute the mass of the electron. For a relation of Einstein says that a given energy has a mass, given by the relation energy $= mc^2$. Hence $mc^2 = e^2/2R$. Solving for the radius, we have $R = e^2/2mc^2$, a familiar formula for the radius of the electron.
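The arithmetic behind $R = e^2/2mc^2$ is easily reproduced. The sketch below is only a numerical illustration, assuming the values of $e$, $m$, and $c$ that the text itself quotes for this formula (in e.s.u., grams, and centimeters per second); the function and constant names are hypothetical.

```python
def electron_radius(e, m, c):
    """Radius R from m c^2 = e^2 / (2 R), neglecting the equilibrium correction."""
    return e**2 / (2.0 * m * c**2)


# Values as quoted in the text (Gaussian units).
E_ESU = 4.774e-10   # charge, e.s.u.
M_GRAM = 9.00e-28   # mass, gm.
C_CM_S = 3.0e10     # velocity of light, cm. per second

R = electron_radius(E_ESU, M_GRAM, C_CM_S)
print(f"R = {R:.3e} cm")
```

With these inputs the result is about $1.41 \times 10^{-13}$ cm, in agreement with the figure given in the text.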
The correct formula, inserting the correction we omitted, differs only by a small factor. Inserting the correct values of $e = 4.774 \times 10^{-10}$ e.s.u., $m = 9.00 \times 10^{-28}$ gm., $c = 3 \times 10^{10}$ cm. per second, we have $R = 1.41 \times 10^{-13}$ cm.

Now if this electron moves, it will have a magnetic field, as a current would, and hence will have a certain magnetic energy. Since the magnetic field is proportional to the velocity (or the current), the magnetic energy is proportional to the square of the velocity. This can be shown to be the kinetic energy. Further, there will be a Poynting vector, pointing in general in the direction of travel of the electron, and representing the flow of energy associated with the electron. All these relations prove on closer examination to be more complicated than they seem at first sight; but they lead to a consistent theory of the nature of the electron. It should be stated, however, that this theory does not fit in with the quantum theory, and that its correct form on the basis of that theory is not known at present.

161. Energy in a Plane Wave. Let us compute the flow of energy in a plane wave of light. It was shown in the last chapter and its problems that the potentials and fields satisfy a wave equation of the form

$$\nabla^2\phi - \frac{\epsilon\mu}{c^2}\frac{\partial^2\phi}{\partial t^2} = 0, \qquad (7)$$

corresponding to propagation with the velocity $v = c/n$, where $n = \sqrt{\epsilon\mu}$. Here $n$, the ratio of the velocity of light in empty space to the velocity in the medium we are interested in, is the index of refraction. It is easy to set up a plane wave solution of the wave equation. Thus a wave of frequency $\nu$, propagated in a direction whose direction cosines are $f$, $g$, and $h$, is represented by

$$\mathbf{E} = \mathbf{E}_0\,e^{2\pi i\nu\left[t - \frac{n}{c}(fx + gy + hz)\right]}. \qquad (8)$$

$\mathbf{E}_0$ is a constant vector, measuring the amplitude of the wave. The exponent is constant, representing constant phase, or a wave front, when $fx + gy + hz = (c/n)t = vt$. Now $fx + gy + hz$ is the projection of the radius vector $x, y, z$ on the direction $f, g, h$, so that, as we see from Fig. 42, all points for which $fx + gy + hz$ is constant lie on a plane whose normal is $f, g, h$, and whose distance from the origin is given by the constant. If this constant is $vt$, the plane travels out with a velocity $v$, as a wave front should. To have a wave of arbitrary phase, $\mathbf{E}_0$ would have to be a complex vector. We can immediately show by substitution that the wave as we have written it is a solution of the wave equation. For instance, $\partial\mathbf{E}/\partial x = -(2\pi i\nu nf/c)\mathbf{E}$, and carrying out the various differentiations and substitutions, and making use of the relation $f^2 + g^2 + h^2 = 1$, the result follows at once.

Fig. 42. Plane wave front $AB$, satisfying equation $fx + gy + hz = \text{constant} = \text{distance } OB$.

Having the form of the solutions for $\mathbf{E}$ and $\mathbf{H}$, we may apply Maxwell's equations. We note that the wave equations separate $\mathbf{E}$ and $\mathbf{H}$ completely, but Maxwell's equations prescribe relations between them, so that actually Maxwell's equations are more restrictive than the wave equation. First, we cannot hope to satisfy the relations unless $\mathbf{E}$ and $\mathbf{H}$ both have the same exponential factor, corresponding to the same frequency and wave normal. Assuming this to be true, we can apply the equations in succession. Let us first take $\operatorname{div}\mathbf{D} = 0$. This leads at once to $-\frac{2\pi i\nu n\epsilon}{c}(fE_x + gE_y + hE_z) = 0$, showing that the scalar product of unit vector along the wave normal, which we may call $\mathbf{k}$, and $\mathbf{E}$, is zero. In other words, $\mathbf{E}$ and $\mathbf{D}$ have no component along the wave normal, or are in the plane of the wave front. Similarly $\operatorname{div}\mathbf{B} = 0$ shows that $\mathbf{B}$ and $\mathbf{H}$ are in the plane of the wave front. Next take the curl equations, beginning with $\operatorname{curl}\mathbf{H} = \frac{\epsilon}{c}\frac{\partial\mathbf{E}}{\partial t}$.
This gives for its $x$ component

$$-\frac{2\pi i\nu n}{c}(gH_z - hH_y) = \frac{\epsilon}{c}(2\pi i\nu)E_x,$$

which is the $x$ component of

$$\mathbf{E} = -\frac{n}{\epsilon}(\mathbf{k}\times\mathbf{H}) = -\sqrt{\frac{\mu}{\epsilon}}\,(\mathbf{k}\times\mathbf{H}),$$

showing that $\mathbf{E}$ is at right angles both to $\mathbf{H}$ and the wave normal, these three then forming a set of three orthogonal directions. Further, since $\mathbf{k}$ and $\mathbf{H}$ are at right angles, the magnitude of $\mathbf{E}$ equals $\sqrt{\mu/\epsilon}$ times the magnitude of $\mathbf{H}$. The fourth equation can be easily shown to lead to the same condition.

Now we find the energy density. It is evidently

$$\frac{1}{8\pi}(\epsilon E^2 + \mu H^2) = \frac{\epsilon E^2}{4\pi},$$

as we see from the relations between $E$ and $H$. Setting $E = E_0\cos 2\pi\nu\left[t - \frac{n}{c}(fx + gy + hz)\right]$, and squaring, we have a quantity oscillating with time, but its time average, which alone has physical significance, is $E_0^2/2$. Hence the mean energy density is $\epsilon E_0^2/8\pi$. Next, Poynting's vector, being at right angles to $\mathbf{E}$ and $\mathbf{H}$, is along $\mathbf{k}$, as it should be. Its magnitude is $(c/4\pi)\sqrt{\epsilon/\mu}\,E^2$, so that its mean is $(c/8\pi)\sqrt{\epsilon/\mu}\,E_0^2$, or $c/\sqrt{\epsilon\mu}$ times the energy density. But this is the result we should expect. This energy would be contained in a volume 1 sq. cm. in cross section, and of length $v = c/\sqrt{\epsilon\mu}$ cm. But if the light moves with a velocity $v$ along the long axis of the volume, this energy will cross the 1 sq. cm. in one second, so that it should represent the flow vector, or Poynting's vector.

162. Plane Waves in Metals. Let us consider the propagation of a plane wave in a metallic conductor, where for simplicity we shall take $\mu = 1$, $\rho = 0$, but $\mathbf{u} = \sigma\mathbf{E}$. Rather than satisfying the wave equation first and then substituting in Maxwell's equations, as we did in the preceding case, we shall vary the procedure by assuming a wave with undetermined velocity, and satisfying all four of Maxwell's equations (in the preceding case only three of Maxwell's equations, and the wave equation, were actually used, Maxwell's fourth equation being automatically satisfied).
Let us then assume that $\mathbf{E}$ and $\mathbf{H}$ are given by expressions of the form

$$e^{2\pi i\nu\left[t - \frac{\alpha}{c}(fx + gy + hz)\right]}, \qquad (9)$$

where $\alpha$ is to be determined. The divergence equations show as before that $\mathbf{E}$ and $\mathbf{H}$ are both in the plane of the wave front. The equation for $\operatorname{curl}\mathbf{E}$ leads to $\alpha(\mathbf{n}\times\mathbf{E}) = \mathbf{H}$, where $\mathbf{n}$ is the unit vector along the wave normal, showing as before that $\mathbf{E}$ and $\mathbf{H}$ are at right angles to each other, and that the magnitude of $\mathbf{H}$ is $\alpha$ times the magnitude of $\mathbf{E}$. The equation $\operatorname{curl}\mathbf{H} = \frac{\epsilon}{c}\frac{\partial\mathbf{E}}{\partial t} + \frac{4\pi\sigma}{c}\mathbf{E}$ gives a new condition,

$$-\frac{2\pi i\nu\alpha}{c}(\mathbf{n}\times\mathbf{H}) = \left(\frac{2\pi i\nu\epsilon}{c} + \frac{4\pi\sigma}{c}\right)\mathbf{E}.$$

This condition likewise shows that $\mathbf{E}$ and $\mathbf{H}$ are at right angles to each other, but now gives the magnitude of $\mathbf{H}$ equal to $\frac{1}{\alpha}\left(\epsilon - \frac{2i\sigma}{\nu}\right)$ times the magnitude of $\mathbf{E}$. These conditions are only consistent if

$$\alpha = \frac{1}{\alpha}\left(\epsilon - \frac{2i\sigma}{\nu}\right), \qquad \alpha^2 = \epsilon - \frac{2i\sigma}{\nu}. \qquad (10)$$

We see, in other words, that $\alpha$, the quantity corresponding to the index of refraction, is complex. Let us write $\alpha = n - ik$, where $n$ and $k$ are real, so that, as we can easily see, $n^2 - k^2 = \epsilon$, $nk = \sigma/\nu$, and

$$n = \left[\tfrac{1}{2}\left(\sqrt{\epsilon^2 + 4\sigma^2/\nu^2} + \epsilon\right)\right]^{1/2}, \qquad k = \left[\tfrac{1}{2}\left(\sqrt{\epsilon^2 + 4\sigma^2/\nu^2} - \epsilon\right)\right]^{1/2}. \qquad (11)$$

To understand the meaning of $n$ and $k$, we substitute in the original expression for the plane wave, Eq. (9). This can be written

$$\mathbf{E}_0\,e^{-\frac{2\pi\nu k}{c}(fx + gy + hz)}\,e^{2\pi i\nu\left[t - \frac{n}{c}(fx + gy + hz)\right]}.$$

The second factor is just like an ordinary plane wave, with index of refraction $n$, though since $n$ depends on frequency, we find the Maxwell theory predicting dispersion of electromagnetic waves in metals. But the first factor, a pure exponential term decreasing as $fx + gy + hz$ increases, means that there is a decrease of amplitude and energy as the wave travels along, or an absorption, as we can easily see from an application of Poynting's theorem, computing the Joule heating within the metal. For this reason $k$ is called the absorption coefficient. We have found that the magnitude of $\mathbf{H}$ is $\alpha$, or $n - ik$, times the magnitude of $\mathbf{E}$.
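Equations (10) and (11) lend themselves to a quick numerical check. The following is a minimal sketch: the function names are hypothetical, and the sample values of $\epsilon$, $\sigma$, and $\nu$ are illustrative assumptions (a conductivity of the order of $10^{17}$ sec.$^{-1}$ in electrostatic units and a frequency in the visible), not figures taken from the text. It computes $n$ and $k$, confirms that $(n - ik)^2$ reproduces $\epsilon - 2i\sigma/\nu$, and evaluates the damping factor of the amplitude.

```python
import math


def optical_constants(eps, sigma, nu):
    """Return (n, k) from Eq. (11), for mu = 1 and Gaussian units."""
    root = math.sqrt(eps * eps + 4.0 * sigma * sigma / (nu * nu))
    n = math.sqrt(0.5 * (root + eps))
    k = math.sqrt(0.5 * (root - eps))
    return n, k


def damping_factor(k, path_in_wavelengths):
    """Amplitude factor exp(-2 pi nu k s / c), with s = fx + gy + hz.

    The path s is measured in units of the vacuum wave length c / nu.
    """
    return math.exp(-2.0 * math.pi * k * path_in_wavelengths)


# Illustrative values only (assumed, not from the text).
EPS = 1.0
SIGMA = 5.0e17   # conductivity in e.s.u. (dimensions of a frequency)
NU = 5.0e14      # frequency, cycles per second

n, k = optical_constants(EPS, SIGMA, NU)
alpha = complex(n, -k)
print("n =", n, " k =", k)
print("alpha^2 =", alpha * alpha, " Eq. (10) gives", complex(EPS, -2.0 * SIGMA / NU))
print("amplitude factor after 0.01 vacuum wave length:", damping_factor(k, 0.01))
```

When $\sigma/\nu$ is large, as here, $n$ and $k$ come out nearly equal and both large, so that the wave is damped within a small fraction of a vacuum wave length.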
If we write the complex number $n - ik$ in the exponential form, we have $n - ik = \sqrt{n^2 + k^2}\,e^{-2\pi i\nu\delta}$, where $\delta = \frac{1}{2\pi\nu}\tan^{-1}\frac{k}{n}$, and

$$|H| = E_0\sqrt{n^2 + k^2}\;e^{-\frac{2\pi\nu k}{c}(fx + gy + hz)}\,e^{2\pi i\nu\left[t - \frac{n}{c}(fx + gy + hz) - \delta\right]},$$

so that there is a phase difference between $E$ and $H$ in a conductor, whereas in an insulator they are in phase. The details of the calculation of electric and magnetic energies are left to a problem.

Problems

1. If the generation of heat per cubic centimeter in a conductor carrying a current is $\sigma E^2$, prove that for a cylindrical conductor of resistance $R$, carrying a current $i$, the rate of generation is $i^2R$.

2. Given a cylindrical wire carrying a current. Find the values of $E$ and $H$ on the surface of the wire, computing Poynting's vector, and show that it represents a flow of energy into the wire. Show that the amount flowing into a given length of wire is just enough to supply the energy which appears as heat in the length. Note that the surface of a wire carrying current is not an equipotential so that there can be a component of electric field parallel to it.

3. Prove $\operatorname{div}(\mathbf{A}\times\mathbf{B}) = \mathbf{B}\cdot\operatorname{curl}\mathbf{A} - \mathbf{A}\cdot\operatorname{curl}\mathbf{B}$.

4. The maximum electric field in a light wave is 0.1 volt per centimeter. Find how much energy is transported by the beam across 1 sq. cm. per second.

5. Given a 40-watt lamp, and suppose that all its energy is dissipated in radiation of one wave length or another. Take a sphere of radius 1 m. surrounding it, and suppose the radiation is of equal intensity in all directions. Find the maximum electric field in the radiation at this distance, in volts per centimeter, and the maximum magnetic field in gauss. Find the energy per cubic centimeter at this distance, in ergs per cubic centimeter.

6. Apply Poynting's theorem to the case of a plane wave traveling in a conductor and show that the rate of dissipation of electromagnetic energy just equals the Joule heating.

7. Calculate the electric and magnetic energies in a plane wave traveling in a metal and show by direct comparison that they are different from each other. What happens in the limiting cases $\sigma \to 0$ and $\sigma \to \infty$, i.e., insulators and perfect conductors?

8. Investigate the behavior of $n$ and $k$ for a metal as functions of frequency, drawing curves. Take $\epsilon = 1$, and take the conductivity of copper. Note that the conductivity in electrostatic units has the dimensions of a frequency, and find in what part of the spectrum this frequency lies. Show that the value of $\epsilon$ is only significant when the frequency becomes greater than $\sigma$.

9. The significance of $\sigma$ as a frequency is found from the relaxation time, the time taken for a volume charge set up within a metal to die down to $1/e$th of its original value. Derive this in the following manner. Set up the equation of continuity for the current density $\mathbf{u}$ and charge density $\rho$. In this, write $\mathbf{u}$ in terms of $\mathbf{E}$ by Ohm's law, and write the result in terms of $\rho$ by the relation $\epsilon\operatorname{div}\mathbf{E} = 4\pi\rho$. Solve the resulting differential equation for $\rho$, showing that the solution is $\rho = \rho_0e^{-t/\tau}$, where $\tau$, the relaxation time, is $\epsilon/4\pi\sigma$, so that $\sigma$ is, as far as its order of magnitude is concerned, the frequency connected with the relaxation time.

CHAPTER XXIII

REFLECTION AND REFRACTION OF ELECTROMAGNETIC WAVES

According to the electromagnetic theory of light, light consists of electromagnetic waves, propagated according to Maxwell's equations. We have already seen how we are led to the wave equation for $\mathbf{E}$ and $\mathbf{H}$, or for the potentials, and we have investigated the plane wave solutions of these equations, showing that $\mathbf{E}$ and $\mathbf{H}$ are at right angles to each other and to the direction of propagation, the latter being the same as the direction of Poynting's vector, giving the energy flow. We shall now investigate the electromagnetic theory of some simple optical phenomena, beginning with reflection and refraction.

163.
Boundary Conditions at a Surface of Discontinuity. We have seen in the last chapter the conditions that hold for a wave in a refracting medium, whose index of refraction is constant. In the problem of reflection and refraction at a boundary between two media, however, the index changes suddenly from one medium to the other, and we must investigate what happens there. Let us assume that the boundary is a plane normal to the $z$ axis. Then we shall apply Maxwell's equations, in the integrated form, to small regions containing the boundary. Thus take a thin flat volume, its faces parallel to the boundary and containing it. Let the area of the face be $A$. Apply to the above the divergence theorem, $\operatorname{div}\mathbf{D} = 4\pi\rho$, or

$$\iiint \operatorname{div}\mathbf{D}\,dv = \iint D_n\,dS = 4\pi q,$$

where $q$ is the total charge within the volume. The surface integral comes almost wholly from the flat faces; it is $A(D_{n2} - D_{n1})$, if $D_2$ is the value of $D$ in the upper medium, $D_1$ in the lower. If now the surface is uncharged, $q$ gets smaller and smaller as the volume becomes thinner, so that in the limit $A(D_{n2} - D_{n1}) = 0$, or $D_{n2} = D_{n1}$. That is, the normal component of $\mathbf{D}$ is continuous at an uncharged surface.

Next let us apply the curl equations, to contours of the following sort: infinitesimal contours of long thin shape, in which one long side is in one medium, the other in the other, parallel to the surface, and the parts of the contour which cross over from one medium to the other are of negligible length compared with the long sides. Consider $\operatorname{curl}\mathbf{H} = \frac{1}{c}\frac{\partial\mathbf{D}}{\partial t} + \frac{4\pi\mathbf{u}}{c}$, or integrated,

$$\oint H_s\,ds = \iint\left(\frac{1}{c}\frac{\partial D_n}{\partial t} + \frac{4\pi u_n}{c}\right)dS.$$

If there is no surface current, $\mathbf{D}$ and $\mathbf{u}$ are finite vectors, so that as the contour gets narrower and narrower, and the area smaller and smaller, the right side of this equation will vanish. The left side approaches $(H_{s2} - H_{s1})L$, where $L$ is the length of the contour, $H_{s1}$ and $H_{s2}$ are the tangential components of $\mathbf{H}$ in the media 1 and 2, respectively.
Thus finally we have $H_{s2} = H_{s1}$, or the tangential component of $\mathbf{H}$ is continuous. Similarly we show that the tangential component of $\mathbf{E}$ is continuous.

Now we can see how to solve a problem involving two media separated by a plane surface, as air and glass. In one medium, we assume a plane wave approaching the boundary. But it must stop at the boundary, for the same plane wave, with the same wave length, would not be a solution of the problem for the second medium. There must be some wave in the second medium, however, for otherwise the boundary conditions could not be satisfied. Thus we are led to the existence of the refracted ray. As a matter of fact, we find that we cannot satisfy the boundary conditions without an incident, refracted, and also a reflected ray. By using all these, with proper relations between direction, amplitude, etc., we can actually satisfy the boundary conditions at the surface of separation of two media.

164. The Laws of Reflection and Refraction. Assume a plane wave in the first medium, striking the surface of separation. This wave will have the form $e^{2\pi i\nu\left(t - \frac{lx + my + nz}{v}\right)}$. Let the surface of separation be given by $z = 0$, the $xy$ plane. Further let the axis be so chosen that the wave normal is in the $xz$ plane, as in Fig. 43, so that $m = 0$. Then at points of the surface of separation the disturbance is given by $e^{2\pi i\nu\left(t - \frac{lx}{v}\right)}$. It is this disturbance which, taken together with the corresponding expressions from the reflected and refracted waves, must satisfy certain boundary conditions.

Next we consider a possible refracted wave. It will be in general of the form $e^{2\pi i\nu'\left(t - \frac{l'x + m'y + n'z}{v'}\right)}$, so that in the surface of separation it will reduce to the value of this with $z = 0$. The boundary conditions must be satisfied for all values of $x$, $y$, and $t$, and yet we have only one constant at our disposal, an amplitude, in addition to the frequency and direction.
It is obvious that the only possibility of satisfying the conditions will come if we make $\nu' = \nu$, $l'/v' = l/v$, $m' = 0$. For then we shall have just the same function of $x$, $y$, and $t$ for both incident and refracted waves, at all points of the boundary. First, then, the refracted wave must have the same frequency as the incident one. Next, if the incident wave normal is in the $xz$ plane, this must also be true of the refracted wave. Finally, there is a relation between the angle of incidence and the angle of refraction. We have $l$ = cosine of the angle between the wave normal and the $x$ axis = sine of the angle between the wave normal and the normal to the surface = $\sin i$, where $i$ is the angle of incidence. Similarly, $l' = \sin r$, where $r$ is the angle of refraction. Thus we have

$$\frac{\sin i}{\sin r} = \frac{v}{v'} = \text{index of refraction of the second medium with respect to the first.}$$

In other words, we have the ordinary law of refraction, as a necessary consequence of the boundary conditions.

Fig. 43. Law of refraction.

Similarly, for the reflected wave, moving in the first medium, we see that $m$ must be equal to zero, and $l$ equal to the value for the incident wave, showing that the angle of reflection equals the angle of incidence. Now the reflected wave must be different from the incident wave, and to do this we must have the $n$ for the reflected wave the negative of the value for the incident one, showing that the reflected wave travels away from the surface rather than towards it.

165. Reflection Coefficient at Normal Incidence. After proving the laws of reflection and refraction, we still have much more to do to apply the boundary conditions. For we must compute the values of the various vectors at the surface, and actually satisfy the conditions. Let us take first the simple case of normal incidence, where $l = 0$, and all waves travel along the $z$ axis.
Let us suppose that in the incident beam we have $\mathbf{E}$ along the $x$ axis, $\mathbf{H}$ along $y$. For simplicity we assume the first medium to have the index of refraction unity, the second the index $n = \sqrt{\epsilon}$. Then in the refracted wave we assume that $\mathbf{E}$ is along the $x$ axis, $\mathbf{H}$ along $y$, and that the value of $E$ is $E'$, so that $H' = nE'$. In the reflected wave, assume that $\mathbf{E}$ has a changed phase, $\mathbf{H}$ not, so that $\mathbf{E}$ is along $-x$, $\mathbf{H}$ along $y$, and each numerically equal to $E''$. The change of phase of one vector and not the other is necessary to reverse the direction of the Poynting's vector. Now we may apply the boundary conditions. All normal components are zero, so that these conditions are automatically satisfied. For the tangential component of $\mathbf{E}$, we have $E - E'' = E'$; for the tangential component of $\mathbf{H}$, $H + H'' = H'$. The latter is then $E + E'' = nE'$. Combining the two, we have at once $E' = \frac{2E}{n + 1}$ (by adding), and $E'' = \frac{E'(n - 1)}{2}$ (by subtracting), leading to

$$\frac{E''}{E} = \frac{n - 1}{n + 1}.$$

This gives us directly the reflection coefficient at normal incidence. The ratio of reflected to incident intensity is proportional to the ratio of the squares of the amplitudes, or $\frac{(n - 1)^2}{(n + 1)^2}$. This shows that the reflected intensity is never so great as the incident, but that the ratio approaches closer and closer to unity as $n$ becomes larger. It is interesting to compute the reflection coefficient for familiar substances. For instance, for glass, $n$ is about 1.5, so that the coefficient is $(0.5/2.5)^2 = 1/25$, showing that only a few per cent of the intensity is reflected from a glass plate at normal incidence.

We can check the energy relations: the amount of energy brought to the surface per unit time in the incident wave should equal the amount carried away in the refracted and reflected waves. The first is $\frac{c}{4\pi}(\mathbf{E}\times\mathbf{H})$, whose magnitude is $\frac{c}{4\pi}E^2$. The reflected energy is $\frac{c}{4\pi}\frac{(n - 1)^2}{(n + 1)^2}E^2$. The refracted intensity is

$$\frac{c}{4\pi}(\mathbf{E}'\times\mathbf{H}') = \frac{c}{4\pi}nE'^2 = \frac{c}{4\pi}\frac{4n}{(n + 1)^2}E^2.$$

The sum of the refracted and reflected intensities is

$$\frac{c}{4\pi}E^2\left[\frac{(n - 1)^2 + 4n}{(n + 1)^2}\right] = \frac{c}{4\pi}E^2,$$

equal to the incident intensity.

166. Fresnel's Equations. Now we pass to Fresnel's equations, the extension of the last section to an arbitrary angle of incidence. Here, for the first time, we meet the question of polarization. The vector $\mathbf{E}$ is at right angles to the direction of propagation, but that does not fix the direction uniquely, and it is said that the wave is polarized in a particular direction if its electric vector points in that direction. Let us then consider the two extreme cases. We take the wave normal of the incident wave to be in the $xz$ plane, as before. Then we consider the case where the electric vector is along the $y$ axis, and the case where it is in the $xz$ plane, as in Fig. 44.

Fig. 44. Vectors in reflection and refraction. Case 1. $y$ axis points down into the paper. $E$ and $E'$ point down, $E''$ points up. Case 2. $H$, $H'$, $H''$ all point down.

Case 1. Electric vector along the $y$ axis. All vectors depend on space in the following way, rewriting $l$, $m$, $n$ in terms of the angles of incidence and refraction: for the incident wave,

$$e^{2\pi i\nu\left(t - \frac{x\sin i + z\cos i}{c}\right)}, \qquad (1)$$

for the refracted wave, with $v = c/n$,

$$e^{2\pi i\nu\left(t - \frac{x\sin r + z\cos r}{v}\right)}, \qquad (2)$$

for the reflected wave,

$$e^{2\pi i\nu\left(t - \frac{x\sin i - z\cos i}{c}\right)}. \qquad (3)$$

We take $\mathbf{E}$ and $\mathbf{E}'$ to be along the $y$ axis. Then $\mathbf{H}$ is in the $xz$ plane, at right angles to the wave normal. That is, for the incident wave, $H_x = -E\cos i$, $H_z = E\sin i$. Similarly, in the refracted wave, $H_x' = -nE'\cos r$, $H_z' = nE'\sin r$, and for the reflected ray $H_x'' = -E''\cos i$, $H_z'' = -E''\sin i$. Hence, we have the following relations:

Normal component of $\mathbf{D}$: nothing, since $\mathbf{D}$ is tangential.
Normal component of $\mathbf{B}$: $E\sin i - E''\sin i = nE'\sin r$.
Tangential component of $\mathbf{E}$: $E - E'' = E'$.
Tangential component of $\mathbf{H}$: $-E\cos i - E''\cos i = -nE'\cos r$.
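The boundary conditions of Case 1 can also be solved numerically, which gives a check on the normal-incidence result of Sec. 165. The sketch below is only an illustration under the assumptions of the text (first medium of index unity, second of index $n$); the function names are hypothetical. It solves the two tangential conditions for unit incident amplitude, compares the result with the closed form $\sin(i - r)/\sin(i + r)$ that the text goes on to derive, and also tests the balance of energy flow normal to the surface at oblique incidence, a check which the text carries out only for normal incidence.

```python
import math


def refraction_angle(i, n):
    """Angle of refraction r from sin i / sin r = n (first medium of index 1)."""
    return math.asin(math.sin(i) / n)


def case1_amplitudes(i, n):
    """Solve E - E'' = E' and cos i (E + E'') = n cos r E' with E = 1.

    Returns (E'', E', r): reflected amplitude, refracted amplitude,
    and the angle of refraction in radians.
    """
    r = refraction_angle(i, n)
    a = n * math.cos(r)
    b = math.cos(i)
    reflected = (a - b) / (a + b)
    refracted = 1.0 - reflected
    return reflected, refracted, r


n_glass = 1.5
for degrees in (0.0, 30.0, 56.0, 80.0):
    i = math.radians(degrees)
    refl, refr, r = case1_amplitudes(i, n_glass)
    # Energy flow normal to the surface, in units of the incident flow:
    # reflected part plus refracted part should add up to unity.
    balance = refl**2 + n_glass * refr**2 * math.cos(r) / math.cos(i)
    print(degrees, refl, balance)
```

At $i = 0$ the reflected amplitude is $(n - 1)/(n + 1)$, which is 0.2 for $n = 1.5$, and the energy balance holds at every angle to rounding error.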
Remembering that $\frac{\sin i}{\sin r} = n$, the first two equations reduce to the same equation, $E - E'' = E'$. The last is

$$E + E'' = \frac{n\cos r}{\cos i}E' = \frac{\tan i}{\tan r}E'.$$

From this at once, multiplying the first by $\frac{\tan i}{\tan r}$, and subtracting, we have

$$E\left(\frac{\tan i}{\tan r} - 1\right) = E''\left(\frac{\tan i}{\tan r} + 1\right),$$

$$\frac{E''}{E} = \frac{\tan i - \tan r}{\tan i + \tan r} = \frac{\sin i\cos r - \cos i\sin r}{\sin i\cos r + \cos i\sin r},$$

$$\frac{E''}{E} = \frac{\sin(i - r)}{\sin(i + r)}. \qquad (4)$$

This gives the amplitude of the reflected wave, and is one of Fresnel's equations. We note that as $i$ and $r$ become zero, the law of refraction becomes $i/r = n$, $i = nr$. Thus in the limit of normal incidence, the ratio approaches $(nr - r)/(nr + r) = (n - 1)/(n + 1)$, as we found above. We also note, in the other extreme of tangential or grazing incidence, that $i = 90$ deg., so that the ratio is $\frac{\sin(90\text{ deg.} - r)}{\sin(90\text{ deg.} + r)} = 1$. That is, the reflection coefficient equals unity for grazing incidence. The formula gives a gradual increase of amplitude as the angle of incidence increases.

Case 2. Electric vector in the $xz$ plane. Let $\mathbf{H}$ be along the $y$ axis in all the waves: $H_y = E$, $H_y' = nE'$, $H_y'' = E''$. Then we take $E_x = E\cos i$, $E_z = -E\sin i$, $E_x' = E'\cos r$, $E_z' = -E'\sin r$, $E_x'' = -E''\cos i$, $E_z'' = -E''\sin i$. Then we have:

Normal component of $\mathbf{D}$: $-E\sin i - E''\sin i = -n^2E'\sin r$.
Normal component of $\mathbf{B}$: nothing.
Tangential component of $\mathbf{E}$: $E\cos i - E''\cos i = E'\cos r$.
Tangential component of $\mathbf{H}$: $E + E'' = nE'$.

Using the law of refraction, the first and last are the same, $E + E'' = nE'$. The other is $E - E'' = E'\frac{\cos r}{\cos i}$. Multiplying the first by $\frac{\cos r}{\cos i}$, the second by $n = \frac{\sin i}{\sin r}$, and subtracting, we have

$$E\left(\frac{\cos r}{\cos i} - \frac{\sin i}{\sin r}\right) = -E''\left(\frac{\cos r}{\cos i} + \frac{\sin i}{\sin r}\right)$$

or

$$\frac{E''}{E} = -\frac{\cos r\sin r - \cos i\sin i}{\cos r\sin r + \cos i\sin i}.$$

Now we see at once that

$$\sin(i \pm r)\cos(i \mp r) = (\sin i\cos r \pm \cos i\sin r)(\cos i\cos r \pm \sin i\sin r) = \sin i\cos i\,(\cos^2r + \sin^2r) \pm \sin r\cos r\,(\sin^2i + \cos^2i) = \sin i\cos i \pm \sin r\cos r.$$

Hence we have

$$\frac{E''}{E} = \frac{\sin(i - r)\cos(i + r)}{\sin(i + r)\cos(i - r)} = \frac{\tan(i - r)}{\tan(i + r)}. \qquad (5)$$

This is the other of Fresnel's equations.

167. The Polarizing Angle. In Case 2 of Sec. 166, where the electric vector is in the plane of incidence, or the $xz$ plane, we notice an interesting fact. If $i + r = 90$ deg., a perfectly possible situation, we have $\tan(i + r) = \infty$, so that $E''/E = 0$. That is, the amount of reflected light, at this angle, is zero. There is no such situation for the other sort of polarization. Suppose, then, that we take an unpolarized beam, such as would be emitted by any ordinary source, and reflect it from a mirror at this angle, called the polarizing angle. The reflected light will consist entirely of the light polarized with the electric vector at right angles to the plane of incidence. It was by this phenomenon that polarized light was first discovered. Light was reflected from one mirror at this angle. Then its polarization was found by reflecting from a second mirror at the same angle. As the second mirror was rotated about the beam as an axis, so that the polarization changed from being at right angles to the plane of incidence to being in the plane, the doubly reflected beam changed from a maximum intensity to zero. The polarizing angle $i'$ is fixed by $i' + r' = 90$ deg., and this occurs when $\cos i' = \sin r'$. Using the law of refraction, we find $\tan i' = n$, thus fixing the definite angle $i'$. For glass the angle of polarization is 56 deg.

168. Total Reflection.
For light passing from a dense medium with index of refraction $n$ to a vacuum of index 1, the law of refraction is $n\sin i = \sin r$. For the angle of incidence given by $\sin i = 1/n$, we have $\sin r = 1$, $r = 90$ deg., and the refracted ray emerges at grazing incidence. For larger angles of incidence, $\sin r$ is greater than 1, and there is no real angle $r$. Physically we know that at these angles, greater than the critical angle, there is total reflection, with no transmitted beam. We can easily investigate the situation mathematically.

In the first place, let us consider the disturbance in the second medium, for we find there is a disturbance, even though no transmitted beam is observed. This is given by an exponential

$$e^{2\pi i\nu\left[t - \frac{x\sin r + z\cos r}{c}\right]},$$

where we remember that the second medium has index 1, velocity $c$. But $\cos r = \pm\sqrt{1 - \sin^2 r} = \pm\sqrt{-1}\sqrt{n^2\sin^2 i - 1}$, a pure imaginary. Thus the exponential becomes

$$e^{2\pi i\nu\left(t - \frac{x\sin r}{c}\right)}\,e^{-\frac{2\pi\nu z}{c}\sqrt{n^2\sin^2 i - 1}},$$

where we have used the negative square root. The first factor represents a wave propagated along the $x$ axis, or parallel to the surface of the medium, with an apparent velocity $c/\sin r$, a value less than $c$. The second factor indicates that the amplitude of this wave is damped out as $z$ increases, or as we go away from the surface, so that the wave fronts (surfaces of constant phase) are at right angles to the surfaces of constant amplitude. This disturbance ordinarily damps out in a very short distance. Thus if $n^2\sin^2 i$ is decidedly greater than 1, the exponential becomes small when $z$ is a few wave lengths ($\nu z/c$ a reasonably large number). Consequently the disturbance is not observed. It is easily shown that Poynting's vector for this wave has no component normal to the surface, so that it does not carry any energy away.

The reflected wave may be treated by Fresnel's equations.
Thus, in Case 1 we have

$$\frac{E''}{E} = -\frac{\cos i\sin r - \sin i\cos r}{\cos i\sin r + \sin i\cos r} = -\frac{a - ib}{a + ib},$$

where $a = \cos i\sin r$, $b = -\sin i\sqrt{n^2\sin^2 i - 1}$. This ratio can now be written as $-e^{-2i\tan^{-1}(b/a)}$, so that $E''$ and $E$ are of the same magnitude, showing that all the light is reflected, but they differ in phase. We may write

$$\frac{E''}{E} = -e^{i\delta_1}, \qquad\text{where}\qquad \tan\frac{\delta_1}{2} = \frac{\sqrt{\sin^2 i - 1/n^2}}{\cos i}.$$

Similarly for Case 2 we have

$$\frac{E''}{E} = \frac{\cos i\sin i - \cos r\sin r}{\cos i\sin i + \cos r\sin r} = \frac{c - id}{c + id},$$

where $c = \cos i\sin i$, $d = -\sin r\sqrt{n^2\sin^2 i - 1}$. Again all the light is reflected, but with a change of phase $\delta_2$ given by

$$\frac{E''}{E} = e^{i\delta_2}, \qquad\text{where}\qquad \tan\frac{\delta_2}{2} = \frac{n^2\sqrt{\sin^2 i - 1/n^2}}{\cos i}.$$

Thus, in the general case, where $E$ has components both in the xz plane and along the $y$ axis, there is a difference of phase between these components upon total reflection, and linearly polarized light in general will become elliptically polarized upon total reflection. To see this, we note that two vibrations at right angles, with the same frequency and phase, produce a resultant vector whose extremity moves in a line (plane polarization), but if the two components are in different phases the extremity of the vector traces out an ellipse. If the phases differ by 90 deg., and the amplitudes are equal, the polarization is circular. It follows from our expressions for $\delta_1$ and $\delta_2$ that the difference between these phase angles, which we denote by $\delta$, is given by the relation

$$\tan\frac{\delta}{2} = \frac{\cos i\sqrt{\sin^2 i - 1/n^2}}{\sin^2 i}.$$

Only in the case of grazing incidence, $i = \pi/2$ (and at the critical angle itself, where $\sin i = 1/n$), does $\delta = 0$, so that our above remarks hold valid except in these cases. It is clear that by causing an elliptically polarized beam to be totally reflected at the correct angle, it can be transformed into a beam of linearly polarized light.

169. The Optical Behavior of Metals.
We shall now examine the law of reflection for light falling on metals, restricting the discussion to the case of normal incidence. In the last chapter we have already shown that in the case of metals we must introduce a "complex index of refraction," $n' = n - ik$, where $k$ is the extinction coefficient, and in so doing we retain the identical form of the relations which we have been using in this chapter. We have already found ($\mu = 1$)

$$n = \sqrt{\tfrac{1}{2}\left(\sqrt{\epsilon^2 + 4\sigma^2/\nu^2} + \epsilon\right)}, \qquad k = \sqrt{\tfrac{1}{2}\left(\sqrt{\epsilon^2 + 4\sigma^2/\nu^2} - \epsilon\right)},$$

where $\sigma$ is the conductivity, $\nu$ the frequency, and $\epsilon$ the dielectric constant. $\epsilon$ is unknown for metals, but since $\sigma$ for metals is so large (in e.s.u. $\sigma = 10^{18}$), we can neglect $\epsilon$ at least for light of sufficiently long period. Thus we find

$$n = k = \sqrt{\sigma/\nu}, \tag{7}$$

relations first found by Drude. For the infra-red, $\sqrt{\sigma/\nu} \gg 1$. We may still use Fresnel's equations as we have done for total reflection. For normal incidence these are simply

$$\frac{E''}{E} = \frac{n' - 1}{n' + 1},$$

and we must insert a complex value of $n'$ for reflection from metals. Thus we have

$$\frac{E''}{E} = \frac{n - 1 - ik}{n + 1 - ik},$$

and taking the square of the magnitude, we find for the ratio of the reflected to the incident intensities,

$$R = \frac{n^2 + k^2 - 2n + 1}{n^2 + k^2 + 2n + 1} = \frac{(n - 1)^2 + k^2}{(n + 1)^2 + k^2}. \tag{8}$$

$R$ is known as the reflective power of the metal. Since $n = k$, we may write

$$R = 1 - \frac{4n}{2n^2 + 2n + 1},$$

and since $n = \sqrt{\sigma/\nu} \gg 1$, this becomes

$$R = 1 - \frac{2}{n} = 1 - \frac{2}{k}, \qquad\text{or}\qquad R = 1 - \frac{2}{\sqrt{\sigma/\nu}}. \tag{9}$$

This relation holds experimentally in the far infra-red, down to $\lambda \approx 5\mu$. The reflective power varies with the color of the incident light, and colors which are strongly absorbed are also strongly reflected.

Problems

1. Light is reflected from glass of index of refraction 1.5. Compute and plot curves for the reflected intensity as a function of angle, for both sorts of plane polarization.

2. Find the intensity of light in the refracted medium, for arbitrary angle of incidence and both types of polarization.
Show that the amount of energy striking the surface is just equal to the amount carried away from it. Note that the amount striking the surface is computed, not from the whole of Poynting's vector, but from its normal component.

3. Show that the reflection coefficient from glass to air at normal incidence is the same as for air to glass, but that the phases of the reflected beams are opposite.

4. Light passes normally through a glass plate. Find the weakening in intensity on account of the reflection at the faces.

5. Ten plates of glass of index 1.5 are placed together and used as a polarizer. Light strikes the plates at the polarizing angle, and the transmitted light is used. Since all the reflected light is of one polarization, and the reflections at both surfaces of all plates are enough to remove practically all of the light of this polarization, the transmitted light will be practically polarized in the other direction. Find the intensity of both sorts of light in the transmitted beam, assuming initially unpolarized light, and hence show how much polarization is introduced. You may have to consider multiple internal reflection.

6. Derive the expressions for $\tan(\delta_1/2)$ and $\tan(\delta_2/2)$ in the paragraph on total reflection.

7. Derive the formulas for the phase difference $\delta$ of the two reflected components of $E$ in the case of total reflection.

8. The conductivity of copper in e.s.u. is $5 \times 10^{17}$ per second. Calculate the reflective power of copper for wave lengths of light $\lambda = 12\mu$ and $\lambda = 25.5\mu$. The observed values of $1 - R$ are 1.6 per cent and 1.17 per cent at these wave lengths.

9. Consider light linearly polarized so that the incident electric vector has equal components in the plane of the wave normal and the perpendicular thereto. If this light falls on a metal, using Fresnel's equations find the ratio of the reflected components of $E$.
If this ratio is written as $\rho e^{i\delta}$ show that

$$\frac{1 - \rho e^{i\delta}}{1 + \rho e^{i\delta}} = \frac{\sin i\tan i}{\sqrt{n'^2 - \sin^2 i}},$$

where $i$ is the angle of incidence and $n'$ the complex index of refraction of the metal.

CHAPTER XXIV
ELECTRON THEORY AND DISPERSION

Maxwell's theory and Maxwell's equations are based on the assumption of dielectrics with dielectric constant $\epsilon$, magnetic substances with permeability $\mu$, conductors with conductivity $\sigma$. These assumptions are unsatisfactory for two reasons. First, cases are known, and in fact are usual rather than exceptional, in which the three constants mentioned are not really constants. Thus the permeability of iron depends on the field strength. The dielectric constant of almost all substances depends on the frequency; as we have seen, the index of refraction $n$ is given by the relation $n = \sqrt{\epsilon}$, and the well-known phenomenon of dispersion shows a dependence of refractivity on wave length or frequency. An extreme case is water, whose index of refraction in the visible is about 1.33, and whose dielectric constant is 80, a result of the fact that the dielectric constant is measured for static fields, and that $n$ as a function of frequency goes from $\sqrt{80}$ at zero frequency, through a region in short radio or long infra-red waves in which the index greatly decreases, so that with the very high frequency of visible light it is reduced to 1.33.

The second reason why Maxwell's assumptions are unsatisfactory is that, since matter is known to be composed of electric charges, electrons with negative charges and atomic nuclei with positive charges, it ought to be possible to explain these typically electrical properties of matter directly in terms of the electronic structure, without having to resort to empirical relations of the sort implied by a constant or variable dielectric constant. The attempt to derive the electrical properties of matter from the electron theory was first made by H. A.
Lorentz, and he was successful not only in explaining the physical meaning of the dielectric constant, permeability, and conductivity, but in deriving their dependence on frequency, field strength, etc. Further developments of the theory, making use particularly of wave mechanics, have carried the subject much further than Lorentz was able to, and in our later chapters on wave mechanics we return to these questions.

170. Polarization and Dielectric Constant. The fundamental physical fact about a dielectric is that, when placed in an electric field, it acquires surface charges on its faces, proportional to the strength of the field. Thus in Fig. 45, a slab of dielectric is shown with positive and negative surface charges, as if the positive had actually been pulled along to the face by the action of the field, the negative pushed to the other face. These surface charges, of course, contribute to the field, just as do other charges, which we actually have control over. The essence of the electron theory is that it treats these induced surface charges in the same way as any other charges, applying the ordinary Maxwell's equations to all charges in existence, and not considering dielectrics as being essentially different from free space, except in so far as they contain these polarizable electrons. Thus, if $\rho$ and $u$ are charge density and current density, respectively, of the so-called "real charge" which we can move about at will, and $\rho_p$ and $u_p$ the charge and current density of the charge arising from polarization, we assume Maxwell's equations for a nonmagnetic medium are

$$\text{curl}\ H = \frac{1}{c}\frac{\partial E}{\partial t} + \frac{4\pi}{c}(u + u_p), \qquad \text{curl}\ E = -\frac{1}{c}\frac{\partial H}{\partial t},$$

$$\text{div}\ E = 4\pi(\rho + \rho_p), \qquad \text{div}\ H = 0. \tag{1}$$

Fig. 45. Polarization of dielectric.

In other words, we assume that the field $E$ is the field of all charge, both "real" and polarization charge, and that the total current resulting from both sources produces the magnetic field.

The polarization charge must be produced, from the originally uncharged dielectric, by the motion of positive charges in the direction of $E$, and of negative charge in the opposite direction. Suppose that in equilibrium two equal charges of opposite sign lie so near together that they exert no appreciable external effect. By means of an external field these charges may be displaced relative to each other by a distance $r$. The charges then form a dipole of moment $p = er$. In producing such a dipole there is clearly a current

$$e\frac{dr}{dt} = ev = \frac{dp}{dt}.$$

If we add the dipole moments of all the polarization electrons in a unit volume we obtain the polarization vector, or the dipole moment per unit volume

$$P = \sum p, \tag{2}$$

and a current density due to these electrons equal to

$$u_p = \rho_p v_p = \frac{\partial P}{\partial t}. \tag{3}$$

In producing dielectric polarization, charges cross a surface in the body. In fact all the charges pass across the surface which originally were contained in a cylinder of base equal to the surface and length $r$. If $r_n$ is the component of $r$ normal to the surface, then we have as the charge passing through the end $dS$

$$\sum e r_n\,dS = P_n\,dS, \tag{4}$$

which is the surface charge appearing on $dS$ if this is an element of the outer surface of the body. If we consider a closed surface, the enclosed volume loses the charge

$$\iint P_n\,dS = \iiint \text{div}\ P\,dv$$

according to Gauss's theorem. The density of polarization electrons remaining is given by $\rho_p = -\text{div}\ P$, since these have the opposite sign to those which have crossed the surface. We thus can write both polarization charge and current in terms of the polarization vector $P$.

We have seen that the field $E$ is that resulting from all charge, including the polarization charge.
The displacement $D$, however, is simply the field resulting from the real charge $\rho$, so that $\text{div}\ D = 4\pi\rho$. To get Maxwell's equations in terms of $D$, we take Eqs. (1), and make the substitutions

$$\rho_p = -\text{div}\ P, \qquad u_p = \frac{\partial P}{\partial t}, \tag{5}$$

which, as we note, obviously satisfy the continuity equation for polarization charge and current. Then we have at once, for the only two equations affected by the change,

$$\text{curl}\ H = \frac{1}{c}\frac{\partial}{\partial t}(E + 4\pi P) + \frac{4\pi u}{c}, \qquad \text{div}\,(E + 4\pi P) = 4\pi\rho. \tag{6}$$

If we set $D = E + 4\pi P$, these become the ordinary Maxwell equations.

171. The Relations of P, E, and D. We have seen that $E$ measures the field of all charge, $D$ that due to the "real" charge, and that $P$ is the polarization per unit volume. To understand $P$ better, we may take a unit cube of dielectric, one pair of faces being perpendicular to the field. Since the polarization surface charge is $P_n$, one of these faces will have a charge on its unit area of $|P|$, the other of $-|P|$, so that the dipole moment of the cube, coming from these two charges at unit distance apart, would be $P$. Similarly, if the volume had had length $L$ parallel to the field, area $A$ in the plane at right angles, the charges on the ends would be $\pm PA$, and the moment, remembering that these are a distance $L$ apart, is $PAL$, or $P$ times the volume, showing that the moment is proportional to the volume, so that it is really correct to regard $P$ as the moment per unit volume.

Fig. 46. Condenser containing dielectric. Condenser plates a and d have surface charges $\pm\sigma$. Induced surface charges are shown on faces b and c of dielectric. The force on unit charge within cavity e is $E$, and within cavity f is $D$.

The relations of the three quantities are perhaps best understood from a simple illustration in the theory of the condenser. In Fig.
46 we have a condenser consisting of two parallel plates a and d with surface charges $\pm\sigma$, respectively. Between them there is a slab of dielectric bc, with surface charges $\pm P$, on the faces c and b, respectively. The field $E$ now is determined from the whole charge; that is, using our relation regarding the relation of discontinuity of field to surface charge, the field within the dielectric is given by $E = 4\pi(\sigma - P)$. The displacement $D$, however, is determined only from $\sigma$, so that

$$D = 4\pi\sigma = E + 4\pi P. \tag{7}$$

The capacity of the condenser is given by the charge, $D/4\pi$, divided by the potential difference, $E$ times the distance $L$ between the plates, or is

$$C = \frac{D}{4\pi EL}.$$

If we define the dielectric constant $\epsilon$ as the ratio $D/E$, this leads correctly to the relation that the capacity of the condenser is $\epsilon$ times the capacity of the same condenser with vacuum in place of the dielectric.

Let us now consider the meaning of the field within the dielectric. Actually, on account of the atomic and electronic structure, the field will change rapidly from point to point, so that it is not so easy as it might seem to define it. The usual method is to set up a long needle-shaped cavity e, pointing in the direction of the field. A point charge placed within the cavity would now be acted on by just the field of real and polarization charges, so that the field $E$ is the force on unit charge in such a cavity. The necessity of choosing that particular shape of cavity is shown by considering the cavity f, which is supposed to be disk-shaped, with its flat face perpendicular to the field. This cavity will have surface charges $\pm P$ set up on its two faces, and it is evident that the lines of force starting from the polarization surface charges on plates b and c will terminate on these faces of the cavity, not crossing it at all, so that the field within it will come wholly from the real surface charges on a and d, or will be $E + 4\pi P = D$.
Thus if we choose we may define $E$ as the field in a cavity shaped like e, in which the effect of the charges on its faces is negligible because the faces are of negligibly small area and arbitrarily far from the point where we are finding the field, while we may define $D$ as the field in a cavity shaped like f. These definitions were originally used for the corresponding magnetic case, by Kelvin. It is interesting to notice that the fields in cavities of other shapes are different, depending on the shape of the cavity. Thus in a later section we shall see that the field in a spherical cavity is $E + (4\pi P/3)$.

We notice finally that since $D = \epsilon E = E + 4\pi P$, we have $\epsilon = 1 + (4\pi P/E)$, a constant if the polarization is proportional to the field. To compute the dielectric constant, or refractive index, we have then to find the polarization, per unit field, and we proceed to do this for gases, and later for solids.

172. Polarizability and Dielectric Constant of Gases. In gases the molecules or atoms are relatively so far apart that we can neglect the interactions between them. Each molecule contains charges which can be displaced under the action of an external field, and these charges act as if they were held to positions of equilibrium by restoring forces proportional to the displacement. Thus in a static case an electron $e$ is acted on by the forces $eE$ of the external electric field, and $-cx$ the linear restoring force. The displacement is then $x = (e/c)E$, and the induced dipole moment $ex = (e^2/c)E$. The ratio $e^2/c$, giving the dipole moment set up by unit field, is called the polarizability, denoted by $\alpha$. Thus the dipole moment per molecule is $\alpha E$, and if there are $N$ molecules per unit volume the polarization $P$ is $N\alpha E$, so that $\epsilon = 1 + 4\pi N\alpha$.

A very simple model of an atom will give us the order of magnitude of the polarizability.
The atom consists of a nucleus of charge $Ze$, where $Z$ is an integer, $e$ the magnitude of the charge on the electron, surrounded by a distribution of negative charge equal to $-Ze$. In the external field the negative charge will be displaced with respect to the nucleus. The restoring force may be computed as if the negative charge filled a sphere of radius $R$ with uniform charge density. Then the positive charge $Ze$, at distance $r$ from the center, would be acted on by a force as if the negative charge within a sphere of radius $r$ were concentrated at the center, all other negative charge being neglected. This charge would be $r^3/R^3$ times the total charge, so that the force would be $(Ze)^2 r/R^3$. The polarizability is then found to be $R^3$, proportional to the volume of the molecule.

173. Dispersion in Gases. We now assume a sinusoidal external field of frequency $\nu$, as in a light wave. The magnetic force on the electron on account of its motion can be neglected. In addition to the external electric force, and the elastic restoring force, we introduce a damping force proportional to velocity, to account for absorption. The equation of motion for the electron is then, for the $x$ coordinate,

$$m\ddot{x} + mg\dot{x} + \omega_0^2 mx = eE_x^0 e^{i\omega t}, \tag{8}$$

where we have placed $\omega = 2\pi\nu$. Thus we have the problem of the damped linear oscillator in forced oscillation. We have solved this problem in Chap. IV, and can write for the steady-state solution

$$x = \frac{\dfrac{e}{m}E_x^0 e^{i\omega t}}{\omega_0^2 - \omega^2 + i\omega g} = \frac{\dfrac{e}{m}E_x}{\omega_0^2 - \omega^2 + i\omega g} \tag{9}$$

in complex form. This shows that the electron vibrates at the same frequency as the light wave but with an amplitude depending on the frequency and out of phase with the light wave.
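The steady-state solution (9) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names and the numerical values (arbitrary units with $\omega_0 = 1$, $g = 0.1$, $e/m = 1$, $E_x^0 = 1$) are assumptions chosen only for illustration. It evaluates the complex amplitude of Eq. (9) and substitutes it back into Eq. (8).

```python
import cmath

def steady_state_amplitude(omega, omega0, g, e_over_m, E0):
    """Complex amplitude A of x(t) = A * exp(i*omega*t), as in Eq. (9)."""
    return e_over_m * E0 / (omega0**2 - omega**2 + 1j * omega * g)

def equation_of_motion_residual(A, omega, omega0, g, e_over_m, E0):
    """Eq. (8) divided by m*exp(i*omega*t): left side minus right side.

    For a time dependence exp(i*omega*t), each time derivative
    multiplies the amplitude by i*omega.
    """
    return (-omega**2 + 1j * omega * g + omega0**2) * A - e_over_m * E0

# Hypothetical illustrative values in arbitrary units (not from the text).
omega0, g, e_over_m, E0 = 1.0, 0.1, 1.0, 1.0

for omega in (0.2, 1.0, 5.0):  # below, at, and above the natural frequency
    A = steady_state_amplitude(omega, omega0, g, e_over_m, E0)
    print(f"omega={omega}: |A|={abs(A):.4f}, phase={cmath.phase(A):.4f} rad")
```

With these assumed values the amplitude is largest near $\omega = \omega_0$, and the phase of the displacement relative to the field passes through a quarter period of lag at resonance, which is the "out of phase" behavior mentioned above.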
If we have $N_k$ electrons per unit volume characterized by the constants $\omega_k$ and $g_k$ (electrons of the $k$th kind) we get for the dipole moment per unit volume:

$$P = \sum_k N_k e x_k = E\sum_k \frac{N_k\dfrac{e^2}{m}}{\omega_k^2 - \omega^2 + i\omega g_k},$$

whence we get for the displacement vector

$$D = E + 4\pi P = E\left[1 + 4\pi\sum_k \frac{N_k\dfrac{e^2}{m}}{\omega_k^2 - \omega^2 + i\omega g_k}\right],$$

and if we introduce a "dynamic" refractive index $(n - ik)$ defined by $D = \epsilon E = (n - ik)^2E$, we find

$$(n - ik)^2 = 1 + 4\pi\sum_k \frac{N_k\dfrac{e^2}{m}}{\omega_k^2 - \omega^2 + i\omega g_k}, \tag{10}$$

so that the index of refraction is a function of the frequency of the light, and different colors travel with different velocities. This is known as the dispersion of light. Furthermore, in general, the index of refraction is a complex quantity and, as we have seen in our discussion of electromagnetic waves in metals, this indicates absorption and is not surprising in view of our introduction of a damping force. In the limit of slow frequencies (long wave lengths of light where $\omega \ll \omega_k$) we may neglect the last two terms in the denominator and find

$$\epsilon = 1 + 4\pi\sum_k \frac{N_k e^2}{m\omega_k^2}$$

as the static value of the dielectric constant of insulators, agreeing with the value found in the last section. If the frequency of the light does not lie near any of the natural frequencies of the electrons, we may neglect the frictional force and find a real index of refraction given by

$$n = 1 + 2\pi\sum_k \frac{N_k\dfrac{e^2}{m}}{\omega_k^2 - \omega^2}$$

if we remember that the index of refraction for gases varies but slightly from unity. Thus there is no absorption and we have the case of normal dispersion. Let us consider the index of refraction as a function of frequency of the light in the visible region of the spectrum. If the natural frequencies of the electrons lie in the ultra-violet (and also for the case that they lie in the infra-red) the index of refraction increases with increasing frequency, the normal behavior.

In case the frequency of the light lies near a natural frequency, we obtain the phenomenon of anomalous dispersion.
In this case the frictional term becomes important and we find an absorption band in the neighborhood of $\omega_0$. The whole discussion is similar to the case of a resonant electric circuit. For simplicity let us assume only one resonant frequency. Remembering that for gases $n$ is very nearly unity, we have:

$$n - ik = 1 + 2\pi\frac{e^2N}{m}\,\frac{1}{\omega_0^2 - \omega^2 + i\omega g},$$

and if we separate into real and imaginary parts, we obtain

$$n = 1 + 2\pi\frac{e^2N}{m}\,\frac{\omega_0^2 - \omega^2}{(\omega_0^2 - \omega^2)^2 + \omega^2g^2} \tag{11}$$

and

$$k = 2\pi\frac{e^2N}{m}\,\frac{\omega g}{(\omega_0^2 - \omega^2)^2 + \omega^2g^2}. \tag{12}$$

$n$ is known as the principal index of refraction and $k$ the absorption coefficient. If we plot $n - 1$ and $k$ against the light frequency, we get curves of the form shown in Fig. 47. Such curves have already been considered in Prob. 10, Chap. IV. In the neighborhood of the absorption region we see that the index of refraction decreases with increasing frequency and this is the anomalous behavior giving rise to the term anomalous dispersion.

Fig. 47. Anomalous dispersion, showing index of refraction and absorption coefficient as function of frequency.

174. Dispersion of Solids and Liquids. In the case of solids and liquids we may no longer make the approximation that the force acting on an electron is simply the electric vector of the light wave in free space, but must take into account the added force on the electron due to the polarization of the body. We can calculate this force as follows: we imagine a small sphere of radius $R$ (with its center at the position of the electron in question) cut out of the medium. If we do this, we have induced charges on the surface of this spherical volume from which we calculate the force at the center of the sphere. We have for the surface density of induced charge on a spherical ring at an angle $\theta$, $\sigma = P_n = P\cos\theta$, as in Fig. 48. The area of the ring is $2\pi R\sin\theta \cdot R\,d\theta = 2\pi R^2\sin\theta\,d\theta$, so that the charge on this ring is $2\pi PR^2\cos\theta\sin\theta\,d\theta$.
This charge produces a field at the center of the sphere whose component parallel to $E$ is

$$dE_1 = \frac{2\pi PR^2\cos^2\theta\sin\theta\,d\theta}{R^2},$$

so that the total charge on the sphere produces a field at the center equal to

$$E_1 = 2\pi P\int_0^{\pi}\cos^2\theta\sin\theta\,d\theta = \frac{4\pi P}{3}.$$

The total electric field at the center of this sphere is then

$$E + \frac{4\pi P}{3}. \tag{13}$$

Of course, there is still the contribution to the force by the atoms inside the little sphere we have cut out, but in an isotropic medium this averages zero.

Fig. 48. Field in spherical cavity in dielectric.

We can now carry over our calculations for gases if we replace $E$ by $E + (4\pi P/3)$ in the expression for $x$. Thus we get

$$P = \left(E + \frac{4\pi P}{3}\right)\sum_k \frac{N_k\dfrac{e^2}{m}}{\omega_k^2 - \omega^2 + i\omega g_k},$$

and using the relations $D = \epsilon E = E + 4\pi P$, we have

$$E + \frac{4\pi P}{3} = \frac{\epsilon + 2}{3}E,$$

and we find for $\epsilon$

$$\frac{\epsilon - 1}{\epsilon + 2} = \frac{(n - ik)^2 - 1}{(n - ik)^2 + 2} = \frac{4\pi}{3}\sum_k \frac{N_k\dfrac{e^2}{m}}{\omega_k^2 - \omega^2 + i\omega g_k}.$$

If $N$ represents the number of atoms, then $N_k = f_kN$ and $f_k$ gives the number of electrons of the $k$th kind per atom, the so-called "oscillator strength," and we have

$$\frac{(n - ik)^2 - 1}{(n - ik)^2 + 2}\,\frac{1}{N} = \frac{4\pi}{3}\sum_k f_k\frac{e^2/m}{\omega_k^2 - \omega^2 + i\omega g_k}. \tag{14}$$

In all cases of transparent substances, where we can neglect the damping force, and the index of refraction is real, we have for a given frequency of light:

$$\frac{n^2 - 1}{n^2 + 2}\,\frac{1}{\rho_0} = \text{constant},$$

where $\rho_0$ is the density of the body, obviously proportional to $N$. This law, known as the Lorenz-Lorentz law, is surprisingly well obeyed for many substances. Of course, in the limit of very long electromagnetic waves, and for the electrostatic case,

$$\frac{\epsilon - 1}{\epsilon + 2}\,\frac{1}{\rho_0} = \text{constant},$$

giving us a relation between dielectric constant and density.
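The constancy asserted by the Lorenz-Lorentz law is easy to test numerically. The sketch below is illustrative only: the function names are hypothetical, and the four $(\rho_0, n)$ pairs are copied from the air measurements tabulated in Problem 5 at the end of this chapter ($\rho_0$ in arbitrary units). It computes $(n^2 - 1)/[(n^2 + 2)\rho_0]$ for each pair and also shows the inverse relation, which recovers $n$ from the Lorenz-Lorentz quantity.

```python
def lorentz_lorenz(n):
    """Return (n^2 - 1) / (n^2 + 2) for a real index of refraction n."""
    return (n * n - 1.0) / (n * n + 2.0)

def index_from_lorentz_lorenz(L):
    """Invert L = (n^2 - 1)/(n^2 + 2), i.e. n^2 = (1 + 2L)/(1 - L)."""
    return ((1.0 + 2.0 * L) / (1.0 - L)) ** 0.5

# (rho0, n) for air, taken from the table in Problem 5 (rho0 in arbitrary units).
air = [(1.00, 1.0002929), (14.84, 1.004338), (96.16, 1.02842), (176.27, 1.05213)]

ratios = [lorentz_lorenz(n) / rho0 for rho0, n in air]
mean_ratio = sum(ratios) / len(ratios)
relative_spread = (max(ratios) - min(ratios)) / mean_ratio

for (rho0, n), ratio in zip(air, ratios):
    print(f"rho0={rho0:7.2f}  n={n:.7f}  (n^2-1)/((n^2+2) rho0) = {ratio:.6e}")
print(f"relative spread of the ratio: {relative_spread:.4f}")
```

Over a density range of more than a hundredfold, the ratio computed from these four rows stays constant to within about one per cent, which is the sense in which the law is "surprisingly well obeyed."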
If we use the expression $E + (4\pi P/3)$ instead of $E$ in the equation of motion of an electron, we find similarly to our equation for gases:

$$(n - ik)^2 = 1 + 4\pi\sum_k \frac{N_k\,e^2/m}{\omega_k^2 - \omega^2 + i\omega g_k} \tag{15}$$

with the only difference that instead of the natural frequency $\omega_{0k}$ of the electrons we find the apparent natural frequencies

$$\omega_k^2 = \omega_{0k}^2 - \frac{4\pi}{3}\frac{N_k e^2}{m}. \tag{16}$$

Thus we have the same type of anomalous dispersion phenomena in solids and liquids that we have in gases.

175. Dispersion of Metals. In metals we picture free electrons wandering about among fixed ions, and these electrons are the conduction electrons. On the average there is no resultant force on the electrons, so that under the influence of an external field we can place the force on an electron equal to $eE$. If we imagine the ions as rigid structures possessing no polarizability, we then have the simplest possible picture of a metal. Thus, in contrast to the bound electrons of the previous sections, we have no restoring force on these electrons. We must, however, introduce a damping force, so that steady-state motion becomes possible. Thus we have as the equation of motion of conducting electrons:

$$m\ddot{x} + mg\dot{x} = eE. \tag{17}$$

This equation must allow an atomistic calculation of the conductivity and if the external field $E$ is constant and in the $x$ direction, we get as the steady-state solution of this equation

$$x = \frac{eE}{mg}t + \text{constant}.$$

Thus the velocity is $\dot{x} = eE/mg$, and if $N$ is the number of conducting electrons per unit volume, we get for the current density

$$u = Ne\dot{x} = \frac{Ne^2E}{mg}.$$

Now by Ohm's law $u = \sigma E$, we find

$$\sigma = \frac{Ne^2}{mg}, \tag{18}$$

so that we are led to an expression for conductivity from an atomic point of view.

It is interesting to note that dimensionally $\sigma$ and $g$ are both of the dimensions of frequencies. We have already seen in Prob. 10, Chap.
XXII, that the period associated with $\sigma$ is the relaxation time, the time taken for any irregularity in charge distribution within the metal to decrease to $1/e$ of its value, and have seen that this frequency, for good conductors, is in the ultra-violet part of the spectrum. The meaning of $g$ is similar, as one could see by imagining an electron initially with a given velocity, and finding the time taken for its velocity to decrease to $1/e$ of its initial value, the result being essentially the period associated with $g$. It seems very reasonable to suppose that approximately equal times would be required for the velocity of electrons to be damped down, and for charge irregularities to be ironed out, and, as a matter of fact, $g$ is found to be of the same order of magnitude as $\sigma$. One can estimate $g$ by making a guess as to the value of $N$, the number of free electrons per unit volume, assuming, for instance, that there is one free electron per atom, and then computing $g$ from the equation $g = Ne^2/m\sigma$. One has, then, two independent constants characterizing the optical behavior of a metal, so that complicated results are not surprising. In addition to this, metals like other substances contain polarizable electrons, which make additional complications.

The formulas for the optical constants of a metal may be found simply by including the free or conduction electrons as a class of bound electrons whose binding force, and natural frequency, are zero. Thus

$$(n - ik)^2 = 1 + \frac{4\pi Ne^2/m}{-\omega^2 + i\omega g} + 4\pi\sum_k \frac{N_k e^2/m}{\omega_k^2 - \omega^2 + i\omega g_k},$$

$$n^2 - k^2 = 1 - \frac{4\pi\sigma}{g}\,\frac{1}{1 + \omega^2/g^2} + 4\pi\sum_k \frac{N_k e^2}{m}\,\frac{\omega_k^2 - \omega^2}{(\omega_k^2 - \omega^2)^2 + (\omega g_k)^2},$$

$$nk = \frac{\sigma}{\nu}\,\frac{1}{1 + \omega^2/g^2} + 2\pi\sum_k \frac{N_k e^2}{m}\,\frac{\omega g_k}{(\omega_k^2 - \omega^2)^2 + (\omega g_k)^2}, \tag{19}$$

where in the last two we have written $Ne^2/m$ as $\sigma g$. The summations are over the bound electrons.
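Equations (19) lend themselves to a quick numerical check. The sketch below keeps only the free-electron term (no bound electrons) and is illustrative rather than definitive: the function names are hypothetical, $\sigma = 5 \times 10^{17}$ per second is the copper value quoted in Problem 8 of Chap. XXIII, while $g = 10^{17}$ per second and $\nu = 10^{13}$ per second are assumed values, chosen only so that $\omega \ll g$. It compares the resulting $n$ and $k$ with $\sqrt{\sigma/\nu}$, and the reflective power from Eq. (8) of Chap. XXIII with $1 - 2/\sqrt{\sigma/\nu}$.

```python
import cmath
import math

def metal_index(nu, sigma, g):
    """Return (n, k) from (n - ik)^2 = 1 + 4*pi*sigma*g / (-omega^2 + i*omega*g).

    Free electrons only: this is the first line of Eqs. (19) with the
    bound-electron sum dropped, and N e^2/m written as sigma*g.
    """
    omega = 2.0 * math.pi * nu
    eps = 1.0 + 4.0 * math.pi * sigma * g / (-omega**2 + 1j * omega * g)
    root = cmath.sqrt(eps)        # principal root; here Re > 0 and Im < 0
    return root.real, -root.imag  # n - ik  ->  n = Re, k = -Im

def reflective_power(n, k):
    """Reflective power at normal incidence, Eq. (8) of Chap. XXIII."""
    return ((n - 1.0) ** 2 + k**2) / ((n + 1.0) ** 2 + k**2)

sigma = 5.0e17  # copper, e.s.u., from Problem 8 of Chap. XXIII
g = 1.0e17      # ASSUMED damping constant (hypothetical), same order as sigma
nu = 1.0e13     # ASSUMED frequency in the far infra-red, so that omega << g

n, k = metal_index(nu, sigma, g)
R = reflective_power(n, k)
print(f"n = {n:.3f}, k = {k:.3f}, sqrt(sigma/nu) = {math.sqrt(sigma / nu):.3f}")
print(f"1 - R = {1.0 - R:.5f}, 2/sqrt(sigma/nu) = {2.0 / math.sqrt(sigma / nu):.5f}")
```

Under these assumptions $n$ and $k$ agree with each other and with $\sqrt{\sigma/\nu}$ to better than one per cent. Choosing a smaller $g$, so that $\omega$ is no longer negligible against it, makes the departure from the simple limit visible.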
We notice that as the frequency becomes low compared with $\sigma$, the first term in the product $nk$ becomes very large compared with unity, masking the effect of the bound electrons. The difference $n^2 - k^2$ does not become correspondingly large, so that in the limit, as we stated in Chap. XXII, $n$ becomes equal to $k$, and both approach $\sqrt{\sigma/\nu}$, neglecting $\omega$ compared to $g$. It is easy to see that at low frequency $n^2 - k^2$ approaches $\epsilon$, if in the dielectric constant we include a contribution $-4\pi\sigma/g$ from the free electrons. However, it is only at low frequencies that these simplifications enter. As the frequency enters the near infra-red or visible region, it becomes of the same magnitude as $\sigma$ and $g$, so that the contributions of the free electrons become complicated, and at the same time $nk$ decreases so that the contributions of the bound electrons become important. It is thus natural that experimentally the curves of $n^2 - k^2$ and $nk$ throughout the visible part of the spectrum are very complicated, though they can be fitted fairly accurately with formulas of the type we have derived, assuming bound as well as free electrons. In the ultra-violet, the frequency becomes too high for the free electrons to follow, the contributions of the free electrons become small compared to those of the bound electrons having resonance in that region, and a metal does not behave essentially differently from an insulator.

In conclusion, we should mention that the introduction of a frictional force proportional to the velocity of the electrons is at best an extremely rough approximation. In metals the steady state is made possible by collisions of the electrons with the ions of the lattice, and the energy of the electrons gained from the external field is thus transmitted to the lattice, excites lattice vibrations, and appears as heat, as we shall describe more in detail later.
All in all, when one considers the approximate nature of this classical electron theory, it is gratifying that it checks as well as it does with experiment and assures us that a more refined atomic picture will lead to an exact theory.

Problems

1. Show that in the case of normal dispersion for the visible spectrum where there is an absorption band in the ultra-violet, the index of refraction can be written as

$$n^2 = A + \frac{B}{\lambda^2} + \frac{C}{\lambda^4} + \cdots,$$

where $\lambda$ is the wave length in vacuum and $A$, $B$, $C$ are constants. If there is also absorption in the infra-red show that the index of refraction is then given by

$$n^2 = A + \frac{B}{\lambda^2} + \frac{C}{\lambda^4} + \cdots - A'\lambda^2 - B'\lambda^4.$$

2. Measurements of H$_2$ gas give the following values of the index of refraction:

    λ in Å         (n − 1) × 10⁷
    5,462.260      1,396
    4,078.991      1,426
    3,342.438      1,461
    2,894.452      1,499
    2,535.560      1,547
    2,302.870      1,594
    1,935.846      1,718
    1,854.637      1,760

Using the expression in Prob. 1 for $n^2$ in reciprocal powers of $\lambda$, calculate the best values of $A$, $B$, and $C$. If the measurements are made at room temperature and atmospheric pressure, calculate the resonant frequency $\omega_0$ and wave length from these constants.

3. Prove that in the case of anomalous dispersion for gases the maximum and minimum values of $n$ occur at the positions where the absorption coefficient reaches half its maximum value. Show that the half width of the absorption band equals the damping constant divided by the mass of an electron. Assume $g/\omega_0 \ll 1$.

4. For the D line of sodium the following values of the constants in the dispersion formula are found: $\omega_0 = 3 \times 10^{15}$; $g = 2 \times 10^{10}$; $4\pi N e^2/m = 10^{23}$. Plot the index of refraction $n$ and the absorption coefficient $k$ as a function of the frequency of light. Find the maximum and minimum values of the index of refraction $n$. Find the maximum value of the absorption coefficient $k$ and the half width of the absorption band in Angstrom units.

5.
Show that for gases the Lorenz-Lorentz law takes the approximate form

$$\frac{2}{3}\,\frac{n - 1}{\rho} = \text{constant}.$$

The following measurements have been made on air ($\rho$ given in arbitrary units):

    ρ          n
    1.00       1.0002929
    14.84      1.004338
    42.13      1.01241
    69.24      1.02044
    96.16      1.02842
    123.04     1.03633
    149.53     1.04421
    176.27     1.05213

Calculate $\dfrac{2}{3}\,\dfrac{n-1}{\rho}$ and $\dfrac{n^2-1}{n^2+2}\,\dfrac{1}{\rho}$ for each of these measurements and compare the constancy of the results (calculate to four significant figures).

6. The indices of refraction for the sodium D line, and densities in grams per cubic centimeter of some liquids at 15°C. are

    Liquid               ρ         n
    Water                0.9991    1.3337
    Carbon bisulphide    1.2709    1.6320
    Ethyl ether          0.7200    1.3558

Calculate the indices of refraction for the vapors at 0°C. and 760 mm. pressure. The observed values for the vapors are 1.000250, 1.00148, and 1.00152, respectively.

7. The quantity $\dfrac{n^2-1}{(n^2+2)\rho}$ is called the refractivity of a substance if $\rho$ denotes its mass. Prove that the refractivities of mixtures of substances equal the sum of the refractivities of the constituents. (Neglect damping forces from the start.)

8. Show that the molecular refractivity of a compound, defined as $\dfrac{n^2-1}{n^2+2}\,\dfrac{M}{\rho}$, where $M$ is the molecular weight, is equal to the sum of the atomic refractivities of the atoms of which the compound is formed. (Neglect damping forces.)

9. Prove that the apparent natural frequencies $\omega_k$, in the equation for the index of refraction for a solid or a liquid, are related to the natural frequencies $\omega_{k0}$ for the electrons in isolated atoms by the equation

$$\omega_k^2 = \omega_{k0}^2 - \frac{4\pi}{3}\,\frac{N_k e^2}{m}.$$

10. For the following gases we have the following values of $(n-1)_\infty$ extrapolated to long wave lengths:

    Gas     (n − 1)∞ × 10⁶
    H₂      136.35
    N₂      294.5
    O₂      265.3

Calculate the values of $(n-1)_\infty$ for the following gases: H$_2$O, NH$_3$, NO, N$_2$O$_4$, O$_3$. The measured values are 245.6, 364.6, 288.2, 496.5, 483.6, all times $10^{-6}$.
Find the percentage discrepancy between the calculated and observed values.

CHAPTER XXV

SPHERICAL ELECTROMAGNETIC WAVES

Suppose that we have an electrical charge oscillating back and forth sinusoidally with the time. This charge will send out a spherical electromagnetic wave, radiating in all directions. There are several physical problems connected with such a wave. First, the phenomenon may be on a large scale, as in a radio antenna. Radiation from a vertical antenna, as a matter of fact, can be approximately treated by replacing the antenna by such an oscillating charge. But also on a smaller scale we can treat the radiation of short electromagnetic waves, or in other words light, from an atom which contains oscillating electrons. The electrons may have been set in motion by heat or bombardment, in which case we have the treatment of the emission of light from a luminous body; or they may be in forced motion under the action of another light wave, as in the case of the scattering of light. As a first step in the discussion of these problems, we consider spherical solutions of the wave equation, then passing on to the special case of electromagnetic fields.

176. Spherical Solutions of the Wave Equation. The wave equation can be solved by separation in spherical coordinates, as we have seen in Probs. 6, 7, and 8 of Chap. XV, and in Sec. 130, Chap. XVIII. The solutions are of the form $e^{\pm i\omega t}\sin m\phi\, P_l^m(\cos\theta)\,R(r)$, where $R$ satisfies a differential equation which, by a slight transformation of the results quoted above, can be written

$$\frac{d^2(rR)}{dr^2} + \left[\frac{\omega^2}{v^2} - \frac{l(l+1)}{r^2}\right] rR = 0. \tag{1}$$

The solution of the equation for $R$ was shown in the problem quoted above to be expressible in terms of Bessel's functions, of half integral order, divided by $\sqrt{r}$. It proves to be possible, however, to express these functions in an alternative manner in terms of exponential or trigonometric functions, and we shall use that more elementary method in the present chapter.
Further, we shall find that we have to consider only the very simplest types of spherical waves, for the purposes we are interested in.

The simplest solution in spherical coordinates is the one independent of angle, for which $l = 0$. In this case, solving Eq. (1), we have $rR = e^{\pm i\omega r/v}$, giving as the solution of the wave equation the functions $e^{i\omega(t \pm r/v)}/r$, reducing to $1/r$ for the static case where $\omega = 0$. This represents a sinusoidal wave, traveling out along $r$ (if we have $t - r/v$) or in along $r$ (if we have $t + r/v$), with a velocity $v$, and with an amplitude which decreases as $1/r$. This decrease of amplitude is necessary if equal amounts of energy are to flow across all concentric spheres, since the intensity, proportional to the square of the amplitude, must be proportional to $1/r^2$ so that its product with the area of the sphere may be constant.

A more general spherical wave can be obtained if we are not limited to sinusoidal vibrations. Thus the wave equation in spherical coordinates, neglecting terms in $\theta$ and $\phi$ which are zero for solutions independent of angle, is

$$\frac{\partial^2 (ru)}{\partial r^2} - \frac{1}{v^2}\frac{\partial^2 (ru)}{\partial t^2} = 0, \tag{2}$$

which has a general solution

$$u = \frac{1}{r}\left[f\!\left(t - \frac{r}{v}\right) + g\!\left(t + \frac{r}{v}\right)\right],$$

as can be proved by direct substitution, where $f$, $g$ are arbitrary functions. This represents one wave traveling out from the center, another traveling in, with arbitrary wave form, and corresponds to the solution $f\!\left(t - \dfrac{lx + my + nz}{v}\right)$ for the wave equation in rectangular coordinates, expressing a plane wave of arbitrary wave form.

More complicated waves are those which are not spherically symmetrical, but instead depend on the angles. We have seen in Sec. 140, Chap. XIX, that if $1/r$ is a solution of Laplace's equation, then $\dfrac{\partial}{\partial n}\!\left(\dfrac{1}{r}\right)$ is a solution, where $n$ is an arbitrary direction. This solution represents the potential of unit dipole, the differentiation giving the difference of the potentials of two opposite charges infinitesimally close together. If $\theta$ is the angle between $n$ and the direction in which we are finding the potential, we have

$$\frac{\partial}{\partial n}\!\left(\frac{1}{r}\right) = \frac{\partial}{\partial r}\!\left(\frac{1}{r}\right)\cos\theta = -\frac{1}{r^2}\cos\theta.$$

This is a solution of Laplace's equation, and in terms of our standard solution in spherical coordinates it is the solution corresponding to $l = 1$, $m = 0$. The function of $r$ is $r^{-(l+1)}$, in accordance with the results of Sec. 130. As a matter of fact, it can be shown that all the solutions of Laplace's equation, and therefore all the spherical harmonics, can be derived in this way by differentiations of the simple solution $1/r$ with respect to different directions.

In a similar way, if we are given the solution $e^{i\omega(t - r/v)}/r$ of the wave equation, we may differentiate with respect to $n$ and again obtain a solution. This gives

$$u = \frac{\partial}{\partial r}\!\left(\frac{e^{i\omega(t - r/v)}}{r}\right)\cos\theta = \left(-\frac{1}{r^2} - \frac{i\omega}{rv}\right) e^{i\omega(t - r/v)}\cos\theta. \tag{3}$$

This is the solution corresponding to $l = 1$, as before, and the function of $r$ in Eq. (3) is the alternative way mentioned above of writing the Bessel's function obtained by direct solution of Eq. (1). For values of $r$ small compared with a wave length, the term $1/r^2$ is large compared with $\omega/rv$, remembering that $\omega/v = 2\pi/\lambda$. Thus at short distances the second term in Eq. (3) can be neglected, and the function falls off as $1/r^2$, as in the static case. Further, at short distances, the quantity $r/v$ in the exponent represents a time lag which is only a short fraction of the period of oscillation, so that we may neglect it, obtaining $-\dfrac{1}{r^2}\cos\theta\, e^{i\omega t}$, the potential we should expect from a dipole of variable moment $e^{i\omega t}$ from a quasi-stationary argument in which we supposed that the variation of the moment was so slow that we could treat the dipole instantaneously as if it were constant.
On the other hand, at large distances, the other term predominates, and the solution of the wave equation falls off as $1/r$. This part of the field is called the radiation field, and we see from it that this solution for a dipole persists to large $r$'s just as does the spherically symmetrical solution, the intensity falling off as $1/r^2$, and the field representing a wave traveling out with velocity $v$. This radiation field is a characteristic feature of solutions of the wave equation, and is not present in the limiting case of Laplace's equation.

177. Scalar Potential for Oscillating Dipole. Let a charge $e$ oscillate up and down along the $z$ axis, its displacement being given by the real part of $Ce^{i\omega t}$. We shall assume an equal and opposite charge to be always at the origin, so that the whole thing is electrically neutral, and constitutes a dipole of moment $eCe^{i\omega t} = Me^{i\omega t}$. We wish to find its field. We shall do this by finding the scalar and vector potentials, first computing these directly, then in a later section showing that they can be easily obtained from another vector, called the Hertz vector. The scalar and vector potentials are solutions of D'Alembert's equations, in which the charge and current densities, respectively, appear on the right sides of the equations. These are different from zero only at the dipole, which is assumed to be of infinitesimal dimensions, so that, except at the origin, the potentials satisfy the wave equation. We must then look for solutions of the wave equation satisfying the one condition that they reduce to the correct value at the origin, or at the dipole itself. The solution (3) is a function reducing to the scalar potential of a dipole in the limiting case of a static field, and we have seen that it also reduces to the value we should expect for points close to the dipole, even in a variable field. It corresponds to the scalar potential of a unit oscillating dipole.
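The two regimes of solution (3) can be made concrete by comparing the magnitudes of its two radial terms, $1/r^2$ and $\omega/rv = 2\pi/\lambda r$, at various distances. The sketch below is only illustrative: the function name is hypothetical, and distances are measured in units of an assumed wave length of 1.

```python
import math

def radial_terms(r, wavelength):
    """Magnitudes of the two terms in the radial factor of Eq. (3)."""
    wave_number = 2.0 * math.pi / wavelength  # omega / v
    static_term = 1.0 / r ** 2                # dominates close to the dipole
    radiation_term = wave_number / r          # dominates far from the dipole
    return static_term, radiation_term

WAVELENGTH = 1.0  # assumed unit of length for this illustration

for r in (0.001, 0.01, 0.1, 1.0, 10.0, 100.0):
    static_term, radiation_term = radial_terms(r, WAVELENGTH)
    print(f"r/lambda={r:<7g} static={static_term:.3e} "
          f"radiation={radiation_term:.3e} ratio={radiation_term / static_term:.3e}")

# The two terms are equal where r = lambda / (2 pi).
crossover = WAVELENGTH / (2.0 * math.pi)
print("terms equal at r/lambda =", crossover)
```

The ratio of the radiation term to the static term is simply $2\pi r/\lambda$, so the change from the quasi-stationary field to the radiation field takes place at distances of the order of a wave length.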
We expect, therefore, that for the dipole of moment $Me^{i\omega t}$ the scalar potential will be

$$\phi = -M\,\frac{\partial}{\partial r}\!\left(\frac{e^{-i\omega r/c}}{r}\right)\cos\theta\, e^{i\omega t}, \tag{4}$$

where now we write the velocity equal to $c$, for the case of light waves.

178. Vector Potential. Next we may find the vector potential, using two facts: first, $\operatorname{div}\mathbf{A} + (1/c)\,\partial\phi/\partial t = 0$; second, since the current is always along the axis, the vector potential must also be in this direction. If now A is along the $z$ axis, we easily have $A_r = A\cos\theta$, $A_\theta = -A\sin\theta$, $A_\phi = 0$, if $A$ is the magnitude of the vector. Let us suppose tentatively that $A$ is a function only of $r$ (being prepared to reject this if it does not work). Then, using the equations for the divergence in spherical coordinates, we have

$$\operatorname{div}\mathbf{A} = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2 A\cos\theta\right) + \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}\left(-A\sin^2\theta\right) = \frac{dA}{dr}\cos\theta + \frac{2A}{r}\cos\theta - \frac{2A}{r}\cos\theta = \frac{dA}{dr}\cos\theta.$$

Also

$$\frac{1}{c}\frac{\partial\phi}{\partial t} = \frac{i\omega\phi}{c}.$$

Hence we have

$$\left[\frac{dA}{dr} - \frac{i\omega}{c}\,M\,\frac{\partial}{\partial r}\!\left(\frac{e^{-i\omega r/c}}{r}\right) e^{i\omega t}\right]\cos\theta = 0.$$

This can be satisfied by

$$A = \frac{i\omega}{c}\,M\,\frac{e^{-i\omega r/c}}{r}\,e^{i\omega t}.$$

We note that this, which represents $A_z$, satisfies the wave equation, as, of course, it must. Then we have

$$A_r = \frac{i\omega}{c}\,M\,\frac{e^{-i\omega r/c}}{r}\cos\theta\, e^{i\omega t}, \qquad A_\theta = -\frac{i\omega}{c}\,M\,\frac{e^{-i\omega r/c}}{r}\sin\theta\, e^{i\omega t}, \qquad A_\phi = 0. \tag{5}$$

179. The Fields. Let us first find the magnetic field $\mathbf{H} = \operatorname{curl}\mathbf{A}$. We at once have $H_r = H_\theta = 0$,

$$H_\phi = \frac{1}{r}\frac{\partial}{\partial r}\!\left(-\frac{i\omega}{c}\,M e^{-i\omega r/c}\sin\theta\, e^{i\omega t}\right) - \frac{1}{r}\frac{\partial}{\partial\theta}\!\left(\frac{i\omega}{c}\,M\,\frac{e^{-i\omega r/c}}{r}\cos\theta\, e^{i\omega t}\right) = M\,\frac{\omega^2}{c^2}\,\frac{e^{-i\omega r/c}}{r}\sin\theta\, e^{i\omega t}\left[-1 - \frac{c}{i\omega r}\right]. \tag{6}$$

From this we see that the magnetic field always goes in circles around the axis, as we should expect from the resemblance of the problem to that of a linear current. At large distances, the second term vanishes compared with the first, leaving

$$H_\phi = -M\,\frac{\omega^2}{c^2}\,\frac{e^{-i\omega r/c}}{r}\sin\theta\, e^{i\omega t}. \tag{7}$$

Next we find the electric field $\mathbf{E} = -\operatorname{grad}\phi - \dfrac{1}{c}\dfrac{\partial\mathbf{A}}{\partial t}$. We have

$$E_r = \frac{\partial}{\partial r}\!\left[M\,\frac{\partial}{\partial r}\!\left(\frac{e^{-i\omega r/c}}{r}\right)\right]\cos\theta\, e^{i\omega t} + M\,\frac{\omega^2}{c^2}\,\frac{e^{-i\omega r/c}}{r}\cos\theta\, e^{i\omega t} = M\cos\theta\, e^{i\omega t}\,\frac{e^{-i\omega r/c}}{r}\left(\frac{2i\omega}{cr} + \frac{2}{r^2}\right),$$

$$E_\theta = -\frac{1}{r}\,M\,\frac{\partial}{\partial r}\!\left(\frac{e^{-i\omega r/c}}{r}\right)\sin\theta\, e^{i\omega t} - \frac{\omega^2}{c^2}\,M\,\frac{e^{-i\omega r/c}}{r}\sin\theta\, e^{i\omega t} = -M\,\frac{\omega^2}{c^2}\,\frac{e^{-i\omega r/c}}{r}\sin\theta\, e^{i\omega t}\left(1 + \frac{c}{i\omega r} - \frac{c^2}{\omega^2 r^2}\right),$$

$$E_\phi = 0. \tag{8}$$

From these results we see that at large $r$'s (large compared with a wave length), E and H become equal to each other, at right angles, and at right angles to the direction of propagation, just as with a plane wave. They equal $-M\,\dfrac{\omega^2}{c^2}\,\dfrac{e^{-i\omega r/c}}{r}\sin\theta\, e^{i\omega t}$, the amplitudes, apart from the sinusoidal parts, being $M\omega^2\sin\theta/c^2 r$. On the other hand, at small distances, the electrical field approaches that calculated for the dipole by electrostatics, falling off as $1/r^3$, while the magnetic field, 90 deg. out of phase with the moment, or in phase with the current, is proportional to $1/r^2$. At intermediate distances, the transition from the one situation to the other is of a complicated form. For discussion of radiation fields, it is only the result at large $r$ that interests us.

The law giving the electric field at large $r$'s can be put in an interesting form. First we take the acceleration, $-\omega^2 M e^{i\omega t}$. We imagine this to be a vector along the axis. Now if we wish the field at a certain point, we take a plane normal to $r$ passing through this point, and project the acceleration vector on that plane, using, however, not the instantaneous value but the value at the previous time $(t - r/c)$. The result, dividing by $rc^2$, gives $-M\,\dfrac{\omega^2 e^{i\omega(t - r/c)}}{rc^2}\sin\theta$, the correct value for the field. We see from this that the dipole sends out maximum radiation to the sides, none along the line of its motion. There is an interesting extension of this to the case of a particle vibrating, not in a line, but in an arbitrary ellipse (the most general sinusoidal motion).
To get the field, we again project the acceleration vector, which is proportional to the displacement, on the normal plane. Thus the vector E in general traces out an ellipse, and the wave is elliptically polarized. An interesting case is that in which the charge rotates in a circle. Then at a point along the axis, the resulting light is circularly polarized; at a point in the plane of the circle, it is linearly polarized; between, it is elliptically polarized.

180. The Hertz Vector. There is another interesting way of considering the dipole solution, due to Hertz. The scalar and vector potentials satisfy the relation

$$\operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\phi}{\partial t} = 0.$$

It would be convenient to have only one quantity from which the electromagnetic field can be derived. It is possible to find such a quantity, a vector $\mathbf{\Pi}$, called the Hertz vector. The above relation can be satisfied identically if we place

$$\mathbf{A} = \frac{1}{c}\frac{\partial\mathbf{\Pi}}{\partial t}, \qquad \phi = -\operatorname{div}\mathbf{\Pi}. \tag{9}$$

This vector $\mathbf{\Pi}$ satisfies the wave equation with no subsidiary conditions such as are imposed on the vector and scalar potentials. If any solution $\mathbf{\Pi}$ of the wave equation is found, then this represents an electromagnetic field, and the electric and magnetic fields are given by

$$\mathbf{E} = \operatorname{grad}\operatorname{div}\mathbf{\Pi} - \frac{1}{c^2}\frac{\partial^2\mathbf{\Pi}}{\partial t^2}, \qquad \mathbf{H} = \frac{1}{c}\operatorname{curl}\frac{\partial\mathbf{\Pi}}{\partial t}. \tag{10}$$

It turns out that the Hertz vector representing the field of an oscillating dipole is simply a spherically symmetrical solution of the wave equation. The correct solution, representing an outgoing wave, is

$$\mathbf{\Pi} = \frac{\mathbf{p}(t - r/c)}{r}, \tag{11}$$

so that, if $\mathbf{p}$ represents the dipole moment of our oscillating charge (including the time variation) pointing along the $z$ axis, it is easy to show that the vector and scalar potentials derived from this Hertz vector are just those derived in the previous sections.
For example, the vector potential

$$A = \frac{1}{cr}\frac{dp(t - r/c)}{dt},$$

and if the dipole moment $p = Me^{i\omega(t - r/c)}$,

$$\frac{dp}{dt} = i\omega M e^{i\omega(t - r/c)},$$

giving for $A$ the value we have already found. In finding the vector potential, we must remember that $\mathbf{\Pi}$ is a vector pointing along the $z$ axis and has the components: $\Pi_r = \Pi\cos\theta$; $\Pi_\theta = -\Pi\sin\theta$; $\Pi_\phi = 0$. If we take the divergence of this vector with a negative sign we are led to our first result for the scalar potential. The fields E and H must, of course, be the same as those discussed, since the vector and scalar potentials are identical. For convenience we write the expressions for them in vector notation. From the above equations relating E and H to $\mathbf{\Pi}$, we have

$$\mathbf{E} = \operatorname{grad}\operatorname{div}\left[\frac{\mathbf{p}(t - r/c)}{r}\right] - \frac{1}{c^2}\,\frac{\mathbf{p}''(t - r/c)}{r}$$

and

$$\mathbf{H} = \frac{1}{c}\operatorname{curl}\left[\frac{\mathbf{p}'(t - r/c)}{r}\right], \tag{12}$$

where the dashes denote differentiation with respect to $t$. These expressions lead to the same values we have been using when $p$ varies sinusoidally with its argument. They are somewhat more general since they hold for any periodic motion of the dipole.

181. Intensity of Radiation from a Dipole. We can easily compute Poynting's vector, and find the total rate at which the dipole is radiating energy. Poynting's vector is evidently

$$\frac{c}{4\pi}\,\frac{M^2\omega^4}{c^4 r^2}\cos^2\omega(t - r/c)\,\sin^2\theta,$$

the time average being

$$\frac{M^2\omega^4\sin^2\theta}{8\pi c^3 r^2}.$$

Let us now integrate over the surface of a sphere of radius $r$, to get the total radiation. The element of area is $r^2\sin\theta\, d\theta\, d\phi$, so that the result is

$$\frac{M^2\omega^4}{8\pi c^3}\int_0^{2\pi}\!\!\int_0^{\pi}\sin^3\theta\, d\theta\, d\phi = \frac{M^2\omega^4}{3c^3} = \frac{16\pi^4 M^2\nu^4}{3c^3}, \tag{13}$$

if $\nu = \omega/2\pi$ is the frequency. This is a well-known formula for the radiation from a dipole. The two essential features are that the radiation is proportional to the square of the amplitude of the dipole, and to the fourth power of the frequency.

182. Scattering of Light. In addition to direct radiation, it is important to consider the process of scattering of light. Suppose that a wave, for example a plane wave, falls on a dipole of the sort we have considered. Let the dipole have an equation of motion $m(\ddot{x} + \omega_0^2 x) = eE$, if $m$ is the mass, $e$ the charge, of the vibrating particle, $E$ the external field, and $x$ the displacement. Then, letting $E = E_0 e^{i\omega t}$, we have $ex$, the moment, equal to

$$\frac{e^2}{m}\,\frac{E}{\omega_0^2 - \omega^2}.$$

This is the oscillating dipole moment produced by the field. Then the dipoles set into motion by the wave will emit light, which is scattered. The rate of emission by a single dipole is

$$\frac{\omega^4}{3c^3}\left(\frac{e^2}{m}\,\frac{E_0}{\omega_0^2 - \omega^2}\right)^2.$$

Often the scattering is measured by the amount of light scattered per cubic centimeter of material, divided by the intensity of the incident light. The latter is $(c/4\pi)(\mathbf{E}\times\mathbf{H})$, its mean value being $cE_0^2/8\pi$. Further, the amount scattered per cubic centimeter is $N$ times that scattered by a single dipole, if there are $N$ dipoles and they scatter independently (as the molecules of a gas do). Hence for the scattering we have

$$\frac{8\pi N e^4}{3c^4 m^2}\,\frac{1}{\left[\left(\dfrac{\omega_0}{\omega}\right)^2 - 1\right]^2}. \tag{14}$$

There are three important special cases of this scattering formula:

(a) The Rayleigh Scattering Formula. This is what we have in the case where $\omega$ is small compared with $\omega_0$. Since for ordinary atoms $\omega_0$ is a frequency in the ultra-violet, we have this condition in the visible range of the spectrum. Then we may neglect 1 compared with $(\omega_0/\omega)^2$, obtaining for the scattering

$$\frac{8\pi N e^4\omega^4}{3c^4 m^2\omega_0^4}. \tag{15}$$

The scattering is here proportional to $\omega^4$, or to $1/\lambda^4$, where $\lambda$ is the wave length. This proportionality to the inverse fourth power of the wave length means that the short blue and violet waves will be scattered much more than the long red ones. An example is the scattering of light by the sky. The air molecules scatter, and on account of the law they scatter much more blue light, resulting in the blue color.
The transmitted light thus has the blue removed and looks red, explaining the color near the sun at sunset.

(b) The Thomson Scattering Formula. In the other limiting case of x-rays, when the frequency is large compared with $\omega_0$, the scattering becomes

$$\frac{8\pi N e^4}{3c^4 m^2}. \tag{16}$$

This formula gives a scattering independent of the wave length, and is very important in discussing x-ray scattering by substances.

(c) Resonant Scattering. If $\omega$ is nearly equal to $\omega_0$, it is evident that the denominator can become very small (of course, if we consider damping, it will not vanish), resulting in a very large scattering. This phenomenon can be much more conspicuous than the two other cases. Thus a bulb filled with sodium vapor, which has a natural frequency in the visible region, illuminated with light of this color, will scatter so much light that it appears luminous. This phenomenon is called resonance scattering.

183. Polarization of Scattered Light. We observe that, if the incident light is plane polarized, the dipoles will all vibrate along the direction of its electric vector. Thus there will be no intensity in the scattered light along this direction. The scattered light will have a maximum intensity at right angles, and it will be plane polarized. It was by experiments based on these facts that the polarization of x-rays was first found.

184. Coherence and Incoherence of Light. In the previous paragraphs we calculated the scattering by $N$ molecules which scatter independently by adding the intensities of the scattered radiation from each. The justification for this requires closer consideration. Since the Maxwell equations are linear the field vectors E and H satisfy the superposition principle, so that we should expect the total amplitude to be the sum of the amplitudes in the various waves, in which case the total intensity, being the square of the amplitude, would certainly not be the sum of the separate intensities.
The key to this situation is found in the relations between the phases of the various waves which we are adding: if they are all in the same phase, they are said to be coherent, and the amplitudes add, while if they are in phases having random relations to each other they are incoherent and the intensities add.

To be more precise, let us consider the sum of a number of sinusoidal waves, all of the same frequency, but of different amplitude and phase:

$$\sum_k A_k\cos(\omega t - \alpha_k) = \Big(\sum_k A_k\cos\alpha_k\Big)\cos\omega t + \Big(\sum_k A_k\sin\alpha_k\Big)\sin\omega t.$$

If all the phases should be the same, say $\alpha_k = 0$, then the amplitudes of the cosine and sine terms will be $\sum_k A_k$ and 0, respectively, so that the amplitudes add, and the intensity is proportional to $\big(\sum_k A_k\big)^2$, or if, for instance, there are $N$ terms of equal amplitude, proportional to $N^2$ times the intensity of a single wave. On the other hand, the $\alpha$'s may be completely independent of each other, meaning that each $\alpha$ is equally likely to have any value between 0 and $2\pi$, independent of the others. Then we can see that $\sum_k A_k\cos\alpha_k$ will be far less than $\sum_k A_k$, since we shall have just about as many terms with positive values of $\cos\alpha_k$ as with negative, and the terms will just about cancel. The cancellation will not be complete, however, as we see if we compute the squares of the summations, which we must add to get the intensity. The square of the first summation, for instance, is

$$\Big(\sum_k A_k\cos\alpha_k\Big)^2 = \sum_k A_k^2\cos^2\alpha_k + \sum_k\sum_{l\neq k} A_k A_l\cos\alpha_k\cos\alpha_l.$$

We must find the average of this, taking the $\alpha$'s as independent. That is, we must perform the operation of integrating each $\alpha$ from 0 to $2\pi$ and dividing by $2\pi$. When we do this, the terms $\cos^2\alpha_k$ average to $\tfrac{1}{2}$, while the products of two independent $\alpha$'s average to zero, leaving $\tfrac{1}{2}\sum_k A_k^2$. The other summation gives an equal term, so that we find that the mean square amplitude, or mean intensity, averaged over phases, is the sum of the individual intensities.
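The averaging over phases can be imitated by a small numerical experiment. The sketch below, with hypothetical function names and an arbitrary fixed seed, adds $N$ waves of unit amplitude, first with equal phases and then with phases drawn at random between 0 and $2\pi$, and averages the squared resultant amplitude over many trials.

```python
import math
import random

def squared_amplitude(amplitudes, phases):
    """Square of the resultant amplitude of sum_k A_k cos(omega t - alpha_k)."""
    cosine_sum = sum(a * math.cos(p) for a, p in zip(amplitudes, phases))
    sine_sum = sum(a * math.sin(p) for a, p in zip(amplitudes, phases))
    return cosine_sum ** 2 + sine_sum ** 2

def mean_random_phase_intensity(n_waves, trials, rng):
    """Average the squared amplitude over many independent sets of random phases."""
    total = 0.0
    for _ in range(trials):
        phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_waves)]
        total += squared_amplitude([1.0] * n_waves, phases)
    return total / trials

N_WAVES = 50
rng = random.Random(0)  # fixed seed, chosen arbitrarily, for repeatability

coherent = squared_amplitude([1.0] * N_WAVES, [0.0] * N_WAVES)
incoherent = mean_random_phase_intensity(N_WAVES, 4000, rng)

print("coherent intensity (expected N^2):", coherent)
print("mean incoherent intensity (expected about N):", incoherent)
```

Any single set of random phases may give a result far from $N$; it is only the average over many sets that settles near the sum of the separate intensities.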
This is the state of complete incoherence, in which for $N$ waves the intensity is $N$ times the intensity of a single wave, rather than $N^2$ as for the coherent case. The cancellation of waves, then, while not complete, is more and more perfect as $N$ increases, for $N$ becomes a smaller and smaller fraction of $N^2$ as $N$ increases.

We can now apply the idea of coherence to the scattering of light from a gas. The phase of the wave at a point $P$, scattered by an atom at $a$ (Fig. 49), depends on the total path the light has traveled from the source to $a$, and from $a$ to $P$. Since the molecules of a gas have no fixed positions with respect to each other, these paths are in a random relation to each other, the phases are incoherent, and we are justified in adding intensities. Such a procedure would not be allowed for example in discussing the scattering of x-rays by crystals, where the various atoms are in fixed lattice positions. Indeed, here we do get interference, and it is just by studying the interference patterns so obtained that we obtain our information about the lattice structure of crystals.

Fig. 49. Scattering from atoms. (a) At right angles to the incident beam, where the paths of the scattered light from the atoms a, b, c are of different and random lengths, so that there is no regular interference, and we add intensities. (b) Scattering straight ahead, where the paths are approximately equal, and the beams interfere to produce the refracted beam.

Neither would the procedure be allowed in discussing the scattering from a gas in the same direction as the incident radiation, as in (b). For then the paths of the beams scattered from the various atoms are approximately equal, the waves are in phase, and they produce a resultant field at $P$ proportional to the amplitude, rather than the intensity, of the incident wave.
This scattered field can be shown to interfere with the incident wave in such a way that the resultant produces the refracted wave. The close relation of our scattering formulas to the formulas for the index of refraction, therefore, becomes clear, and it is evident that our two problems of refraction and scattering, though we have treated them separately, are really parts of the same subject. The scattering straight ahead produces refraction, and does not depend on the exact placing of the molecules. Scattering to the sides, on the other hand, does not occur unless the molecules have a random arrangement, and then the intensity, not the amplitude, is proportional to the number of molecules.

185. Coherence and the Spectrum. The amplitude of a wave, as a function of time, is never exactly sinusoidal, but is really a much more complicated function. It is often desirable, however, to resolve such a function into a spectrum; that is, write it as a sum of sinusoidal waves of different frequency. This can be conveniently done by Fourier series. To do this, we take a Fourier series with an extremely long period $T$, so long that all the phenomena we are interested in take place in a time short compared with $T$, so that we are not bothered by the periodicity of the series. Then, if our function is $f(t)$, we have

$$f(t) = \sum_n \left(A_n\cos\omega_n t + B_n\sin\omega_n t\right),$$

where

$$A_n = \frac{2}{T}\int_{-T/2}^{T/2} f(t)\cos\omega_n t\, dt, \qquad B_n = \frac{2}{T}\int_{-T/2}^{T/2} f(t)\sin\omega_n t\, dt, \qquad \omega_n = \frac{2\pi n}{T}.$$

This gives an analysis into an infinite number of sine waves, with frequencies spaced very close together (on account of the very small size of $2\pi/T$). No actual, physical wave is then perfectly sinusoidal, in the sense of having but one term in this expansion with an amplitude different from zero.
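A wave train of finite duration gives a simple illustration of this analysis. The sketch below evaluates $A_n$ and $B_n$ by a midpoint-rule sum for a sine wave of unit period that lasts only for a time $T_0$ and is zero otherwise. The long period $T = 200$, the two durations compared, and the function name are assumptions made for the illustration.

```python
import math

def fourier_amplitude(n, long_period, duration, omega0, steps=4000):
    """sqrt(A_n^2 + B_n^2) for f(t) = sin(omega0 t) on 0 < t < duration, zero elsewhere.

    The integrals over (-T/2, T/2) reduce to integrals over (0, duration),
    which are approximated by a midpoint-rule sum. Requires duration < T/2.
    """
    omega_n = 2.0 * math.pi * n / long_period
    dt = duration / steps
    a_n = 0.0
    b_n = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        f = math.sin(omega0 * t)
        a_n += f * math.cos(omega_n * t) * dt
        b_n += f * math.sin(omega_n * t) * dt
    scale = 2.0 / long_period
    return math.hypot(scale * a_n, scale * b_n)

LONG_PERIOD = 200.0      # assumed period T of the Fourier series
OMEGA0 = 2.0 * math.pi   # wave of unit period, so omega_n = omega0 at n = 200

for duration in (10.0, 40.0):
    peak = fourier_amplitude(200, LONG_PERIOD, duration, OMEGA0)
    print(f"duration T0 = {duration}")
    for n in (200, 205, 210, 220):
        relative = fourier_amplitude(n, LONG_PERIOD, duration, OMEGA0) / peak
        print(f"  n={n}  omega_n/omega0={n / 200.0:.3f}  relative amplitude={relative:.3f}")
```

The shorter train should show appreciable amplitude over a wider range of $n$ about $n = 200$ than the longer one, the range of frequencies being of the order of the reciprocal of the duration.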
We shall show in a problem that even a perfectly sinusoidal wave which persists for only a finite length of time will have appreciable amplitudes for all those frequencies within a range $\Delta\omega$, equal in order of magnitude to the reciprocal of the time during which the wave persists, so that a sine wave of long lifetime will correspond to a sharp line in the spectrum, while a rapidly interrupted wave will give a broad line. This is observed experimentally in the fact that increasing the pressure of a gas, thereby making collisions more frequent and interrupting the radiating of the atoms, broadens the spectral lines.

The intensity is proportional to $f^2(t)$, or to the square of the summation over frequencies. Just as before, this square consists of terms like $A_n^2\cos^2\omega_n t$, and cross terms like $A_n A_m\cos\omega_n t\cos\omega_m t$. Instantaneously none of these terms are necessarily zero. But if we average over time, the terms of the first sort average to $A_n^2/2$, while those of the second sort average to zero. The final result, then, is that the time average intensity is the sum of the intensities of the various frequencies:

$$\overline{f^2(t)} = \sum_n \tfrac{1}{2}\left(A_n^2 + B_n^2\right).$$

We are justified in considering the terms connected with a given $n$ to be the intensity of light of that particular frequency in the spectrum, so that we have the theoretical method of determining the spectral analysis of any disturbance. And we see that the following statement is true: on a time average, sinusoidal waves of different frequencies are always incoherent, and never interfere.

186. Coherence of Different Sources. It is known experimentally that light from two different sources never interferes; to get interference we must take light from a single source, split it into two beams, and allow these beams to recombine.
If we regarded the sources as being monochromatic, it would be hard to see why this should be, for the amplitudes of two waves of the same frequency should add, rather than the intensities, and this is the essence of interference. But when we observe that each source really is represented by a Fourier series, the situation becomes plain. For two sources are always so different that their Fourier series will be entirely different. If we analyze both of them, the phase of the radiation of frequency $\omega_n$ from one will be entirely independent of the phase of the corresponding frequency from the other. Thus if we add the disturbances, square, and average over this random relation between the phases of the two sources, the cross terms will cancel, and the intensities add. The randomness comes in this case, not in adding a great many terms of the same frequency, but in combining the terms of different frequencies, which are related in entirely independent ways in the two sources.

Problems

1. Discuss the weakening of sunlight on account of scattering, as the light passes through the atmosphere. Assume that the molecules of the atmosphere have a natural frequency at 1,800 Å. (where absorption is observed). Let each molecule contain an electron of this frequency. Assume that the number of molecules is such as to give the normal barometric pressure. Find the fractional weakening of a beam due to scattering in passing through a sheet of thickness $ds$, and from this set up the differential equation for intensity as a function of the distance. Solve for the ratio of intensity to the intensity before striking the atmosphere, for the sun shining straight down, and for it shining at an angle of incidence of 60 deg. Constants: $e = 4.774 \times 10^{-10}$ e.s.u., $m = 9.00 \times 10^{-28}$ gm., number of molecules in 1 gm.-mol $= 6.06 \times 10^{23}$.

2. A vibrating dipole radiates energy, and therefore its own energy decreases.
Noting that the rate of radiation is proportional to the energy, set up the differential equation for the energy of the dipole as a function of the time. Find how long it takes the dipole to lose half its energy. Work out numerical values for the sort of dipole considered in Prob. 1.

3. Using the results of Prob. 2, find the equivalent damping term which would make the dipole lose energy at the same rate as the radiation. This damping is called the radiation resistance.

4. Show that the values for $\mathbf{E}$ and $\mathbf{H}$, which we have found, satisfy Maxwell's equations, by direct calculations in polar coordinates.

5. Derive the expressions for $\mathbf{E}$ and $\mathbf{H}$ in terms of the Hertz vector $\boldsymbol{\Pi}$ from the equations defining $\boldsymbol{\Pi}$.

6. Show that the fields $\mathbf{E}$ and $\mathbf{H}$ in terms of $p(t - r/c)$ and its time derivatives reduce to the values in terms of the dipole moment $M$.

7. Show that near an oscillating dipole the magnetic field is given by

$$\mathbf{H} = -\frac{1}{c r^3}\left\{\mathbf{r} \times \dot{\mathbf{p}}(t)\right\},$$

and thus can be derived from the Biot-Savart law when we place $\dot{p}(t) = I(t)\,ds$, where $I(t)$ is the current and $ds$ an element of length in the direction of the dipole.

8. Show from the Hertz vector for the dipole case, that at large distances from the dipole,

$$\mathbf{E} = \frac{1}{c^2 r^3}\left\{\mathbf{r} \times \left[\mathbf{r} \times \ddot{\mathbf{p}}\left(t - \frac{r}{c}\right)\right]\right\}, \quad\text{and}\quad \mathbf{H} = -\frac{1}{c^2 r^2}\left\{\mathbf{r} \times \ddot{\mathbf{p}}\left(t - \frac{r}{c}\right)\right\}.$$

9. Suppose we have an alternating current of maximum value $I$ (measured in e.m.u.) in a vertical antenna of length $l$. Treating this as a dipole, show that the total radiation is

$$\frac{4\pi^2 c}{3}\,\frac{l^2 I^2}{\lambda^2}.$$

Show that the equivalent resistance necessary to produce the same power loss (the radiation resistance) is

$$R = 80\pi^2\,\frac{l^2}{\lambda^2}$$

if $R$ is measured in ohms, and if we place $c = 3 \times 10^{10}$ cm. per second.

10. Find the spectrum of a disturbance which is zero up to $t = 0$, is sinusoidal until $t = T_0$, then is zero permanently. (Hint: make the period $T$ of the Fourier series indefinitely large compared with $T_0$.)

11. Find the spectrum of a disturbance which starts at $t = 0$, and is a sinusoidal damped wave after that.
Show that the curve for intensity as a function of frequency has the same form as a resonance curve, in general, and that its breadth is connected with the logarithmic decrement in the same way. This illustrates an important principle: the emission and absorption spectrum of the same substance are essentially equivalent. The resonance curve represents the absorption curve, on account of the relation of forced oscillators and dispersion, while the damped wave is the emission. (Hint: make the period $T$ indefinitely large compared with the time taken for the oscillation to fall to $1/e$th of its value.)

CHAPTER XXVI
HUYGENS' PRINCIPLE AND GREEN'S THEOREM

Huygens' principle is a well-known elementary method for treating the propagation of waves, and in this chapter we shall consider its mathematical background, showing its close connection with Green's theorem. The method is this: From each point of a given wave front, at $t = 0$, we assume that spherical wavelets start out. At time $t$, each wavelet will have a radius $ct$, and the envelope of these wavelets will form a new surface, which according to Huygens is simply the resulting wave front at this later time $t$. Thus, if the original wave front was a plane, it is easy to see that the final one will be a plane distant by the amount $ct$, while, if it is a sphere, the final wave front will be a concentric sphere whose radius is larger by $ct$. In either case this construction gives us the correct answer, agreeing with the more usual methods of computation. The one difficulty is that our construction would give a wave traveling backward, as well as one traveling forward; the solution of this difficulty appears when we use the methods of this chapter.

We may look at our process in a slightly different way, not used by Huygens, but developed later when the interference of light was being worked out.
Suppose that, instead of taking the envelope of all the spherical wavelets, we consider that each of these wavelets has a certain amplitude, consisting of a sinusoidal vibration. We then add these vibrations, just as if the wavelets were being sent out by interfering sources of light, and the resulting amplitude is taken to be that in the actual wave. This process can be shown to lead to essentially the same result, and it is this which can be justified theoretically. As a further generalization, it is not necessary to take the original surface to be a wave front; it can be any surface, so long as we allow the scattered wavelets to have the suitable phase and amplitude.

Our final result, then, is this: The disturbance at a point $P$ of a wave field may be obtained by taking an arbitrary surface, and performing an integration over this surface. The contribution of a small element of area $dS$ of this surface equals the amplitude at $P$ of a spherical wave starting from $dS$ at such a time that it reaches $P$ at time $t$. That is, if the distance from $dS$ to $P$ is denoted by $r$, this wave is of the form

$$\frac{f(t - r/c)}{r}.$$

Now the contribution, for a given wavelet, must surely be proportional to the disturbance at $dS$, which we may call $f$ (a function of time and position), and to $dS$. Hence we have something like $\iint \frac{f(t - r/c)}{r}\,dS$ for the final result. We are thus led to a formula of this sort:

$$f\ (\text{at a point } P) = \text{constant} \times \iint \frac{f(t - r/c)}{r}\,dS,$$

where the surface integral is over a surface surrounding $P$. This suggests the solution of Laplace's equation by Green's method, where we had the value of a function $\phi$ at an interior point of a region where $\nabla^2\phi$ was zero as a surface integral over the boundary. As a matter of fact, an analogue to Green's theorem is the correct statement of Huygens' principle, and replaces the formula which we have derived intuitively above, and which is not quite correct.

187.
The Retarded Potentials. In Chap. XXI, we have introduced scalar and vector potentials $\phi$ and $\mathbf{A}$, giving the electric and magnetic fields by the relations

$$\mathbf{E} = -\operatorname{grad}\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{H} = \operatorname{curl}\mathbf{A}.$$

For these potentials we found the equations

$$\nabla^2\phi - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = -4\pi\rho,$$

with an equation of the same form for each component of $\mathbf{A}$, or D'Alembert's equation. We ask first how to get a solution of D'Alembert's equation analogous to the simple solution

$$\phi = \iiint \frac{\rho\,dv}{r}$$

of Poisson's equation. We shall not carry through the proof of the solution, for that is rather complicated. But the essence of the solution of Poisson's equation is that we divide up all space into volume elements $dv$, and that $\rho\,dv/r$ is the potential of the point charge $\rho\,dv$ at a distance $r$. This potential, of course, is a solution of Laplace's equation, as is $1/r$, at all points except for $r = 0$, where the charge is located.

In a similar way, to solve D'Alembert's equation, we divide up our charge into small elements, and write the potential as the sum of the separate potentials of these small charges. The separate potentials must now be, except at $r = 0$, solutions of the wave equation. This means that, since any change of the charge will be propagated outward with the velocity $c$, the potential at a given point of space resulting from a particular charge cannot be derived from the instantaneous value of the charge, but must be determined, instead, by what the charge was doing at a previous instant, earlier by the time $r/c$ required for the light to travel out from the charge to the point we are interested in. In other words, if $\rho(x, y, z, t)$ is the charge density at $x, y, z$ at the time $t$, and $r$ is the distance from $x, y, z$ to $x', y', z'$, where we are finding the field, we shall expect the potential of the charge in $dv$ to be

$$\frac{\rho(x, y, z, t - r/c)\,dv}{r},$$

and for the whole potential we shall have

$$\phi(x', y', z', t) = \iiint \frac{\rho(x, y, z, t - r/c)}{r}\,dv. \tag{3}$$

This solution is, as a matter of fact, correct.
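
The statement that each element contributes a retarded spherical wave can be checked numerically. The sketch below is only an illustration under stated assumptions: the Gaussian `pulse`, the value of `C`, and the helper names are hypothetical choices, not anything given in the text. It estimates the residual of the spherically symmetric wave equation, $(1/r)\,\partial^2(r\phi)/\partial r^2 - (1/c^2)\,\partial^2\phi/\partial t^2$, by central differences, once for the retarded form $g(t - r/c)/r$ and once for an instantaneous form $g(t)/r$.

```python
import math

C = 2.0  # wave velocity in arbitrary units (assumed value, for illustration)

def pulse(s):
    """Hypothetical smooth time history of a point charge; any smooth g(s) would do."""
    return math.exp(-s * s)

def retarded(r, t):
    """Potential of the form g(t - r/c) / r, as in the integrand of Eq. (3)."""
    return pulse(t - r / C) / r

def instantaneous(r, t):
    """Potential built from the instantaneous value of the charge, g(t) / r."""
    return pulse(t) / r

def wave_residual(field, r, t, h=1e-3):
    """Central-difference estimate of (1/r) d2(r*phi)/dr2 - (1/c^2) d2(phi)/dt2."""
    d2_r = ((r + h) * field(r + h, t) - 2.0 * r * field(r, t)
            + (r - h) * field(r - h, t)) / h**2
    d2_t = (field(r, t + h) - 2.0 * field(r, t) + field(r, t - h)) / h**2
    return d2_r / r - d2_t / C**2

if __name__ == "__main__":
    print("retarded residual:     ", wave_residual(retarded, 3.0, 1.7))
    print("instantaneous residual:", wave_residual(instantaneous, 3.0, 1.7))
```

Under these assumptions the retarded form leaves only a residual of the size of the finite-difference error, while the instantaneous form leaves a residual of ordinary magnitude, which is the point of the argument above.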
We have already seen that $\frac{f(t - r/c)}{r}$ is a solution of the wave equation, where $f$ is any function, so that the integrand actually satisfies the wave equation, as in the earlier case $1/r$ satisfied Laplace's equation. The potential $\phi$ determined by this equation is called a retarded potential, since any change in the charge is not instantaneously observable in the potential at a distant point, but its effect is retarded on account of the finite velocity of light. The solution for the vector potential is determined in an analogous manner.

188. Mathematical Formulation of Huygens' Principle. In discussing the application of Green's theorem to the solution of Poisson's equation in a finite region of space, we have proved that

$$\phi = \iiint \frac{\rho}{r}\,dv - \frac{1}{4\pi}\iint\left[\phi\,\frac{\partial(1/r)}{\partial n} - \frac{1}{r}\frac{\partial\phi}{\partial n}\right]dS,$$

the result of the last paragraph being the special case where the region of integration is infinite and the surface integral drops out. We now wish to find an analogous theorem for use with D'Alembert's equation. Here again we shall not give a real derivation, for this is very complicated, but shall merely describe the formula which results, and show that it is plausible. We have already discussed the volume integral. In the surface integral, the first term gave the potential of a double layer of strength $\phi/4\pi$, the second the potential of a surface charge of magnitude $\frac{1}{4\pi}\frac{\partial\phi}{\partial n}$. Each of the terms, $\phi\,\frac{\partial(1/r)}{\partial n}$ and $\frac{1}{r}\frac{\partial\phi}{\partial n}$, is a solution of Laplace's equation since it represents the potential of certain charges.

In our case of the wave equation, the formula has two corresponding terms: one giving the potential of a double layer, the other of a surface charge. But now the charges change with time, so that we must use solutions of the wave equation in the integral.
We have already seen that the solution of the wave equation corresponding to $\frac{1}{r}$ is $\frac{f(t - r/c)}{r}$; hence we expect the second term to be replaced by $\frac{1}{r}\left(\frac{\partial\phi}{\partial n}\right)_{t - r/c}$, where this means that the partial derivative, which is now a function of time as well as of position on the surface, is to be computed, not at $t$, but at $t - \frac{r}{c}$. Similarly corresponding to $\frac{\partial(1/r)}{\partial n}$, the difference of the potentials of two equal and opposite point charges at neighboring points of space, we have $\frac{\partial}{\partial n}\left[\frac{f(t - r/c)}{r}\right]$. Remembering that in differentiating with respect to $n$ we must regard $r$ as a variable each time it occurs, this is

$$\left[f(t - r/c)\,\frac{\partial(1/r)}{\partial n} + \frac{1}{r}\frac{\partial f(t - r/c)}{\partial n}\right] = -\frac{\cos(n, r)}{r}\left\{\frac{f(t - r/c)}{r} + \frac{1}{c}\frac{\partial f(t - r/c)}{\partial t}\right\},$$

where in the last term we have used the relation

$$\frac{\partial f(t - r/c)}{\partial n} = \frac{\partial f(t - r/c)}{\partial(t - r/c)}\,\frac{\partial(t - r/c)}{\partial n} = \frac{\partial f(t - r/c)}{\partial t}\left(-\frac{1}{c}\frac{\partial r}{\partial n}\right) = -\frac{1}{c}\cos(n, r)\,\frac{\partial f(t - r/c)}{\partial t}.$$

We should, therefore, expect to have

$$\phi = \iiint \frac{\rho(t - r/c)}{r}\,dv + \frac{1}{4\pi}\iint\left[\frac{\cos(n, r)}{r}\left\{\frac{\phi(t - r/c)}{r} + \frac{1}{c}\frac{\partial\phi(t - r/c)}{\partial t}\right\} + \frac{1}{r}\left(\frac{\partial\phi}{\partial n}\right)_{t - r/c}\right]dS. \tag{4}$$

This, as a matter of fact, is the correct formula. The first term represents the potential due to all the charge within the volume; if there are no sources of light within this volume, the volume integral is then zero, and that is the usual case with optical applications. The surface integral represents the remaining potential as arising from a distribution of charge and double distribution about the surface, each surface element sending out a wavelet which on closer examination proves to be the Huygens' wavelet we are interested in. Thus, starting from Green's theorem and D'Alembert's equation, we have arrived at a mathematical formulation of Huygens' principle.

To give a suggestion of the rigorous proof of this formula, we could proceed as follows: First, we notice that $\phi$ defined by this integral satisfies the wave equation; for since each term of the integrand separately is a solution, the sum must also be.
Now it follows from this, although we have not proved it, that if the solution reduces to the correct boundary values at all points of the boundary, the solution must be the correct one, the reason being essentially that the boundary values determine a solution uniquely, so that, if we have one solution of the equation with the right boundary values, it must be the only correct solution. We must then show that the $\phi$ defined by the integral actually has the correct boundary values. This could be done by a more careful treatment, and we should then have a demonstration of the formula. The more conventional proof, however, is a fairly direct though complicated application of Green's theorem.

189. Application to Optics. We shall now take our general formula (4), and apply it to the cases we meet in optics, showing that it reduces to something like the formula which we had earlier derived intuitively. We suppose that light is emitted by a point source, and that the value of some quantity connected with, and satisfying, the wave equation (one of the components of the fields or potentials; they all satisfy the same relations) has the form

$$\phi = \frac{A\,e^{2\pi i\nu(t - r_1/c)}}{r_1},$$

where $r_1$ is the distance from the source to the point where we wish to find the disturbance. Then we wish to get the disturbance at $P$, not by direct calculation, but by using Huygens' principle. Suppose we take a closed surface. This surface can either surround the source, or the point $P$ where we wish the disturbance. In any case, we have $n$ as the normal pointing out of the part of space in which $P$ is located. At a point of the surface, $\phi = \frac{A\,e^{2\pi i\nu(t - r_1/c)}}{r_1}$, where $r_1$ is the distance from the source to the point on the surface. We then have, if $r$ is the distance from $P$ to a point on the surface,

$$\phi(t - r/c) = \frac{A}{r_1}\,e^{2\pi i\nu[t - (r + r_1)/c]},$$

$$\frac{\partial\phi(t - r/c)}{\partial t} = \frac{2\pi i\nu A}{r_1}\,e^{2\pi i\nu[t - (r + r_1)/c]},$$

$$\left(\frac{\partial\phi}{\partial n}\right)_{t - r/c} = -A\cos(n, r_1)\left(\frac{1}{r_1} + \frac{2\pi i\nu}{c}\right)\frac{e^{2\pi i\nu[t - (r + r_1)/c]}}{r_1}.$$

Thus finally, writing $\nu/c = 1/\lambda$,

$$\phi = \frac{1}{4\pi}\iint \frac{A}{r r_1}\,e^{2\pi i\nu[t - (r + r_1)/c]}\left[\left(\frac{1}{r} + \frac{2\pi i}{\lambda}\right)\cos(n, r) - \left(\frac{1}{r_1} + \frac{2\pi i}{\lambda}\right)\cos(n, r_1)\right]dS. \tag{5}$$

In this formula, as in Chap. XXV, we have two sorts of terms, some significant at small values of $r$ and $r_1$, others at large. We easily see that, if $r$ and $r_1$ are large compared with a wave length, as is always the case in optics, the only terms we need retain are those in $2\pi i/\lambda$. Hence to this approximation

$$\phi = \iint \frac{iA}{2\lambda r r_1}\,e^{2\pi i\nu[t - (r + r_1)/c]}\left[\cos(n, r) - \cos(n, r_1)\right]dS. \tag{6}$$

This final form suggests our earlier, intuitive formulation of Huygens' principle. The incident amplitude at $dS$ is

$$\frac{A}{r_1}\,e^{2\pi i\nu(t - r_1/c)}.$$

Now we set up, starting from $dS$, a wavelet whose amplitude is this value, retarded by the amount $r/c$, divided by $r$, and multiplied by the factor $\frac{i}{2\lambda}\left[\cos(n, r) - \cos(n, r_1)\right]dS$. This is just what we should expect, except for the last factor. The term $i$ introduces a change of phase of 90 deg., not present in Huygens' form of the principle, but necessary. The term $\cos(n, r) - \cos(n, r_1)$ makes the wavelets have an amplitude which depends on angle. When $r$ and $r_1$ are in opposite directions, which is the case when the surface is between the source and $P$, the factor approaches 2, while when $r$ and $r_1$ are parallel, and the surface is beyond $P$, it becomes zero. This means that the wavelets do not travel backwards, thus removing the difficulty noticed earlier in Huygens' method. The wavelets have an amplitude depending on their wave length, decreasing for the longer wave lengths.

190. Integration for a Spherical Surface by Fresnel's Zones. Let us now carry out our integration, and verify Huygens' method, in a simple case. We take the surface to be a sphere, surrounding the source, and therefore a wave front. We note that $n$ is the inner normal of the sphere.
Thus $r_1$ is constant all over the sphere, and $\cos(n, r_1) = -1$ at all points, so that the formula simplifies to

$$\phi = \frac{iA}{2\lambda r_1}\,e^{2\pi i\nu(t - r_1/c)}\iint \frac{e^{-2\pi i r/\lambda}}{r}\left[\cos(n, r) + 1\right]dS.$$

Now suppose we introduce, as a coordinate on the sphere, the distance $r$ from the point $P$; that is, we cut the sphere with spheres concentric with $P$, laying off zones between them, as in Fig. 50. We can easily get the area between $r$ and $r + dr$, and hence the element of area. Take as an axis the line joining the source and the point $P$, and consider a zone making an angle between $\theta$ and $\theta + d\theta$ with the axis. The area of the zone is $2\pi r_1^2\sin\theta\,d\theta$. But now by the law of cosines, if $R$ is the distance from the source to $P$, $r^2 = R^2 + r_1^2 - 2Rr_1\cos\theta$, and differentiating, $2r\,dr = 2Rr_1\sin\theta\,d\theta$. Hence for the area of the zone we have $\frac{2\pi r_1 r}{R}\,dr$. Introducing this, we have

$$\phi = \frac{i\pi A}{\lambda R}\,e^{2\pi i\nu(t - r_1/c)}\int_{r_{\min}}^{r_{\max}} e^{-2\pi i r/\lambda}\left[\cos(n, r) + 1\right]dr,$$

where $r_{\min} = R - r_1$, $r_{\max} = R + r_1$. To carry out this integration, we use a device called Fresnel's zones, giving us an approximate value in a very elementary way.

Fig. 50. Construction for Fresnel's zones on a sphere surrounding the source.

Beginning with $r_{\min}$, we take a set of zones such that the outer edge of each corresponds to a value of $r$ just half a wave length greater than the inner edge. The contributions of successive zones will almost exactly cancel. The integral, then, consists of a sum of terms, say $s_1 - s_2 + s_3 - \cdots + s_n$, where the magnitudes of $s_1, s_2, \ldots$ vary only very slightly from one to the next. Now it is true in general that in such a series the sum is approximately half the sum of the first and last terms. We can see this as follows. We group the terms

$$\frac{s_1}{2} + \left(\frac{s_1}{2} - s_2 + \frac{s_3}{2}\right) + \cdots + \left(\frac{s_{n-2}}{2} - s_{n-1} + \frac{s_n}{2}\right) + \frac{s_n}{2}.$$
Now, on account of the slow variation of magnitude, we have very nearly $s_k = \frac{s_{k-1} + s_{k+1}}{2}$. If this were so, however, each of the parentheses would vanish, leaving only $\frac{s_1}{2} + \frac{s_n}{2}$. In our case, the contribution of the first zone is to be considered, but that of the last zone is practically zero, on account of the factor $\cos(n, r) + 1$, so that the result is half the first zone. Now, in the first zone, $\cos(n, r) + 1$ is so nearly equal to 2 that we can take it outside the integral, obtaining

$$\phi = \frac{i\pi A}{\lambda R}\,e^{2\pi i\nu(t - r_1/c)}\int_{R - r_1}^{R - r_1 + \lambda/2} e^{-2\pi i r/\lambda}\,dr = \frac{A}{R}\,e^{2\pi i\nu(t - r_1/c)}\,e^{-2\pi i(R - r_1)/\lambda} = \frac{A\,e^{2\pi i\nu(t - R/c)}}{R}, \tag{7}$$

the correct value.

191. The Use of Huygens' Principle. In the derivations of this chapter we have traveled in a very roundabout way to reach a very obvious result. We naturally ask, what is Huygens' principle good for, aside from a mathematical exercise? The answer is found in the problem of diffraction. There one has certain opaque screens, with holes in them, and a light wave falling on them. If the light comes from a point source, geometrical optics would tell us that the shadow of the screen would have perfectly sharp edges. But actually this is not true; there are light and dark fringes around the edge of the shadow. If the shadow is observed at a greater and greater distance, these fringes get proportionally larger and larger, until they entirely fill the image of the hole. Finally at great distances the fringes grow in size until the resulting pattern has no resemblance at all to the geometrical image. There are then two general sorts of diffraction: first, that in which the pattern is like the geometrical image, but with diffuse edges, and which is called Fresnel diffraction; secondly, that in which the pattern is so extended that it has no resemblance to the geometrical image, and which is called Fraunhofer diffraction. Both types of diffraction, as well as the intermediate cases, can be treated by using Huygens' principle.

192.
Huygens' Principle for Diffraction Problems. Suppose that light from a point source falls on a screen containing apertures, and that we wish the amplitude at points behind the screen. Then we surround the point $P$, where we wish the field, by a surface consisting of the screen, and of a large surface, perhaps hemispherical, extending out beyond $P$, and enclosing a volume completely. We apply Huygens' principle to the surface. In doing so, we assume (1) that the amplitude of the incident wave, at points on the apertures, is the same that it would be if the screen were absent; and (2) that immediately behind the screen, and at points of the hemispherical surface as well, the amplitude is zero, the wave being entirely cut off by the screen. This is, of course, an approximation, since at the edge of a slit, for example, the amplitude of the wave does not suddenly jump from zero to a finite value. The exact treatment is exceedingly difficult, but in the one case for which it has been worked out, it substantiates our approximations.

To find the disturbance at $P$, then, we integrate over the surface, but set the integrand equal to zero, except at the openings of the screen, obtaining

$$\phi = \iint \frac{iA}{2\lambda r r_1}\,e^{2\pi i\nu[t - (r + r_1)/c]}\left[\cos(n, r) - \cos(n, r_1)\right]dS,$$

the integral being over the openings. We note that only the edges of the openings are significant, the shape of the screen away from the opening being unimportant.

Now let us assume, as is almost always true in practice, that the distances $r_1$ and $r$, from source to screen and from the screen to $P$, are large compared with the dimensions of the holes. Then $1/rr_1$ and $[\cos(n, r) - \cos(n, r_1)]$ are so nearly constant over the aperture that we may take them outside the integral, replacing $r$ and $r_1$ by mean values $\bar{r}$ and $\bar{r}_1$.
If in addition we write $r + r_1$ in the exponential as $\bar{r} + \bar{r}_1 + r' + r_1'$, where $r'$ and $r_1'$ are the small differences between $r$ and $r_1$ and their values at some mean point of the aperture, we have finally

$$\phi = \frac{iA}{2\lambda\bar{r}\bar{r}_1}\left[\cos(n, \bar{r}) - \cos(n, \bar{r}_1)\right]e^{2\pi i\nu[t - (\bar{r} + \bar{r}_1)/c]}\iint e^{-2\pi i(r' + r_1')/\lambda}\,dS. \tag{8}$$

The whole factor outside the integral may be taken as a constant factor so that, if we are interested only in relative intensities, we may leave it out of account. We finally have a sinusoidal vibration of which the amplitudes of the components of the two phases are proportional to

$$C' = \iint \cos\frac{2\pi}{\lambda}(r' + r_1')\,dS \quad\text{and}\quad S' = \iint \sin\frac{2\pi}{\lambda}(r' + r_1')\,dS.$$

Hence the intensity is proportional to $C'^2 + S'^2$, and our task is to compute this value.

193. Qualitative Discussion of Diffraction, Using Fresnel's Zones. By using Fresnel's zones, one can see qualitatively the explanation of the diffraction fringes, particularly in Fresnel diffraction. Suppose that we join the source $S$ and a point $P$ with a straight line, as in Fig. 51, and consider the point of the screen cut by this line, a point for which $r + r_1$ has a minimum value. Let us surround this point by successive closed curves in which $r + r_1$ differs from its minimum value by successive whole numbers of half wave lengths. It is not hard to see that these curves will be the intersections with the screen of a set of ellipsoids of revolution, whose foci are $S$ and $P$. Hence if the line $SP$ is approximately normal to the screen, the curves will be approximately circles. Successive zones included between successive curves will propagate light differing by a half wave length from their neighbors.

Fig. 51. Fresnel's zones on a plane. (Ellipsoid: $r_1 + r$ = constant.)

Now on the screen we may imagine the pattern of zones, and also the apertures. The whole nature of the diffraction depends on what zones are uncovered, and can transmit light, and what ones are obscured by the screen.
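
The zone construction just described lends itself to a small numerical sketch. The code below assumes the simplest geometry, with the line $SP$ exactly normal to a plane screen; the function name `zone_radius`, the parameter names, and the sample distances are hypothetical and serve only as illustration. It solves for the radius at which the path $r + r_1$ exceeds its minimum by $k$ half wave lengths, using bisection on the exact path length.

```python
import math

def zone_radius(k, wavelength, d_source, d_point):
    """Radius on the screen at which r + r1 exceeds its minimum by k half wave lengths.

    d_source and d_point are the perpendicular distances of the source S and
    the point P from the screen (SP assumed normal to the screen).
    """
    target = d_source + d_point + k * wavelength / 2.0

    def path(rho):
        # Exact path source -> screen point -> P for a point at radius rho.
        return math.hypot(d_source, rho) + math.hypot(d_point, rho)

    lo, hi = 0.0, 1.0
    while path(hi) < target:      # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):          # plain bisection; path() increases with rho
        mid = 0.5 * (lo + hi)
        if path(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    # Assumed sample values: wave length 5e-5 cm, source and P each 100 cm away.
    for k in range(1, 5):
        rho = zone_radius(k, 5e-5, 100.0, 100.0)
        print(k, rho, math.pi * rho * rho)
```

For distances large compared with the zone radii, the radii found this way should follow closely the small-angle estimate $\rho_k \approx \sqrt{k\lambda\,d_S d_P/(d_S + d_P)}$, where $d_S$ and $d_P$ stand for the two distances assumed above, so that successive zones have very nearly equal areas. That is consistent with the remark that the contributions of successive zones vary only slowly.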
We may distinguish three cases, shown in Fig. 52:

1. The center of the system of zones lies well inside the aperture. The central zone is entirely uncovered, as are a number of the others. As we get to larger zones, we shall come to one of which a small part is covered; then one which is more covered; and so on, until finally we come to one only slightly uncovered; and then the rest are entirely obscured. Now we can write our integral, as in paragraph 190, as a sum of integrals over the successive zones. As before, these contributions will decrease very gradually from one zone to the next. When we reach the zones that are obscured, the decrease will become a little more rapid, but not so much as to interfere with the argument. We can still write the whole thing as half the sum of the first and the last zones. In our case, the last zone which contributes has a negligibly small area exposed, so that it contributes practically nothing, and the whole integral is half the first zone. But this gives just the intensity we should have in the absence of the screen.

2. The center of the zone system is well behind the screen ($P$ is in the geometrical shadow). Then the first few zones are obscured. A certain zone begins to be uncovered, until finally some zones are uncovered to a considerable extent. Large zones become obscured again, however. Thus in our sum, while there are terms different from zero, both the first and the last terms are zero, so that the sum is zero. The intensity well inside the geometrical shadow is zero.

Fig. 52. Fresnel's zones and rectangular aperture. (1) Directly in path of light. (2) In geometrical shadow. (3) On edge of shadow.

3. The center of the zone system is near the edge of the screen. Then the first zone may be partly obscured, so that there is some intensity, but not so great as without the screen. Or the first zone may be entirely uncovered, but the next ones partly obscured.
In these cases, the contributions from the successive zones may differ so much that our rule of taking the first and last terms is no longer correct. It is possible for the whole amplitude to be more than half the first zone, so that the intensity is actually greater than without the screen. As we move into the geometrical image from the shadow, it turns out that there is a periodic fluctuation, on account of the uncovering of successive zones, and this explains the diffraction fringes.

Problems

1. Try to carry out exactly the integration which we did approximately by using Fresnel's zones.

2. The source is at infinity, so that a wave front is a plane. Set up Fresnel's zones, and find the breadth of the $n$th zone, and its area.

3. A plane wave falls on a screen in which there is a circular hole. Investigate the amplitude of the diffracted wave at a point on the axis, showing that there is alternate light and darkness as either the radius of the hole increases, or as the point moves toward or away from the screen. (Suggestion: the integral consists of a finite number of zones.)

4. A plane wave falls on a circular obstacle. Show that at a point behind the obstacle, precisely on the axis, there is illumination of the same intensity which we should have if the obstacle were not there. Explain why this would not hold for other shapes of the obstacle.

5. Take a few simple alternating series, as $1/2 - 1/3 + 1/4 - 1/5 \cdots$, $1/2 - 1/4 + 1/8 \cdots$, $1/2^2 - 1/3^2 + 1/4^2 - \cdots$, etc., and find whether our theorem about the sum of a number of terms is verified for them. In doing this, it may be necessary to start fairly well out in the series, so as to satisfy our condition that successive terms differ only slightly in magnitude.

6. Prove the statement that the boundaries of Fresnel's zones are the intersection of the screen with ellipsoids of revolution whose foci are the source and the point $P$.
What happens to these ellipsoids as the source is removed to infinity?

CHAPTER XXVII
FRESNEL AND FRAUNHOFER DIFFRACTION

In the present chapter we proceed to the mathematical discussion of Fresnel and Fraunhofer diffraction, based on the methods of Huygens' principle derived in Chap. XXVI. The problems which we take up are Fresnel and Fraunhofer diffraction through a slit; Fraunhofer diffraction through a circular aperture; and the diffraction grating, an example of Fraunhofer diffraction. In Eq. (8) of the last chapter, we have seen that the essential step in computing the diffraction pattern is the evaluation of the integral

$$\iint e^{-2\pi i(r + r_1)/\lambda}\,dS,$$

where the integration is over the aperture of the screen, $dS$ is an element of surface in the aperture, $r$ is the distance from the source to the element $dS$, and $r_1$ the distance from the element to the point $P$ where the field is being found. If the incident wave is a plane wave, and the plane of the aperture is a wave front, then $r$ is the same for all elements, and the factor $e^{-2\pi i r/\lambda}$ can be cancelled out of the integral. The remaining integral, $\iint e^{-2\pi i r_1/\lambda}\,dS$, represents the sum at $P$ of the amplitudes of spherical waves of equal intensity and phase starting from all points of the aperture. It is the interference of these waves which produces the diffraction pattern.

194. Comparison of Fresnel and Fraunhofer Diffraction. The two types of diffraction, Fresnel and Fraunhofer, arise from observing the pattern near to, or far from, the screen. Let the normal to the screen be the $z$ axis, as in Fig. 53, and let the screen containing the aperture be at $z = 0$. The light passing through the aperture is caught on a second screen at $z = R$.
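
With the aperture at $z = 0$ and the receiving screen at $z = R$, the integral $\iint e^{-2\pi i r_1/\lambda}\,dS$ can be approximated by simply adding up the wavelets. The sketch below is a rough illustration under stated assumptions, not the method of the text: it keeps only the $x$ dependence (a one-dimensional reduction suitable for a long slit), uses a midpoint sum, and the function and parameter names are hypothetical.

```python
import cmath
import math

def slit_intensity_direct(x0, slit_width, distance, wavelength, n=4000):
    """Relative intensity at screen coordinate x0 from a direct sum of wavelets.

    The slit runs from -slit_width/2 to +slit_width/2 in x; the y dependence
    is dropped (assumed one-dimensional reduction for a long slit).
    """
    dx = slit_width / n
    total = 0j
    for k in range(n):
        x = -slit_width / 2.0 + (k + 0.5) * dx   # midpoint of the k-th strip
        r1 = math.hypot(x0 - x, distance)         # exact distance to the point P
        total += cmath.exp(-2j * math.pi * r1 / wavelength) * dx
    return abs(total) ** 2

if __name__ == "__main__":
    # Assumed sample values: slit five wave lengths wide, screen 100 wave lengths away.
    for x0 in (0.0, 5.0, 10.0, 20.4, 30.0):
        print(x0, slit_intensity_direct(x0, 5.0, 100.0, 1.0))
```

Because the exact $r_1$ is used, the same routine covers both the near and the far region; the approximations introduced below for the Fresnel and Fraunhofer cases replace $r_1$ by simpler expressions so that the integral can be handled analytically.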
Physically, the diffraction pattern has the following nature: close to the aperture, the light passes along the $z$ axis as a column or cylinder of illumination, of cross section identical with the aperture, so that, if the screen at $R$ is close to the aperture, the illuminated region will have the same shape as the aperture, and we speak of rectilinear propagation of the light. As $R$ increases, however, the column of light begins to acquire fluctuations of intensity near its boundaries, so that the pattern on the screen has fringes around the edges. This phenomenon is the Fresnel diffraction. The size of the Fresnel fringes increases proportionally to the square root of the distance $R$. Thus Fig. 54 shows, in its upper diagram, the slit, parallel column of light, and parabolic lines starting from the edges of the slit, indicating the position of the outer bright fringe of the Fresnel pattern, if we are sufficiently near to the slit. As $R$ becomes larger, the fringes become so large that there are only one or two in the pattern of the aperture, and the pattern shows but small resemblance to the shape of the aperture, though it still is of roughly the same dimensions.

Fig. 53. Aperture and screen for diffraction through rectangular slit.

With further increase of $R$, we finally enter the region of Fraunhofer diffraction. Here the beam of light, instead of consisting of a luminous cylinder, resembles more a luminous cone (indicated by the diverging dotted lines in the top diagram of Fig. 54). Thus the Fraunhofer pattern becomes larger and larger as $R$ increases, being in fact proportional to $R$, so that we can describe it by giving the angles rather than distances between different fringes. Often Fraunhofer diffraction is observed, not by placing the screen at a great distance, but by passing the light through a telescope focused on infinity.
Such a telescope brings the light in a given direction to a focus at a given point of the field. Thus it separates the different Fraunhofer fringes, since each of these goes out from the source in a particular direction.

In Fig. 54, diffraction patterns are shown indicating the transition from Fresnel to Fraunhofer diffraction. The pattern $a$ illustrates the Fresnel pattern for one edge of an infinitely wide slit. The patterns $b$ to $g$ represent the actual diffraction patterns from the slit, at distances indicated in the upper diagram. These patterns are all drawn to the same scale. They are drawn for a slit five wave lengths wide, for the sake of getting the figure on a diagram of reasonable scale. If the wave length were shorter, then for the same slit the distances would be stretched out to the right, and the Fraunhofer pattern would correspond to smaller angular deflections. This would be necessary to bring the Fresnel cases far enough from the slit so that our approximations would be really applicable. Finally, in $h$, we give the limiting Fraunhofer pattern, not drawn to scale.

Fig. 54. Transition from Fresnel to Fraunhofer diffraction for a slit. (a) Fresnel pattern for edge of infinitely wide slit. (b)-(g) Actual diffraction patterns from slit, at distances indicated in upper diagram. (h) Fraunhofer pattern.

Let coordinates in the plane of the aperture be $x, y$, and in the plane of the screen at $R$ let the coordinates be $x_0, y_0$, as in Fig. 53. Then, if the element of area is at $x, y, 0$, and the point $P$ at $x_0, y_0, R$, the distance $r_1$ between them is

$$r_1 = \sqrt{(x_0 - x)^2 + (y_0 - y)^2 + R^2}.$$

The integration cannot be performed with this expression for $r_1$, and Fresnel and Fraunhofer diffraction lead to two different approximate methods of rewriting $r_1$, leading to different methods of evaluating the integral. We can see the relation of these two methods most clearly from Fig. 55, in which $r_1$ is plotted as a function of $x_0 - x$, for the special case where $y_0 - y = 0$. The resulting curve is a hyperbola. Now in all ordinary cases, $R$ is large compared with the dimensions of the aperture. That is, the range of abscissas representing the dimensions of the aperture (from $x_0 - x_1$ to $x_0 - x_2$, if $x_1$ and $x_2$ are the extreme coordinates of the aperture), is small compared with the distance $R$, the intercept of the hyperbola on the axis of ordinates. The two cases are now represented by the ranges $ab$ and $cd$ of abscissas, respectively. In the first, $x_0 - x_1$ and $x_0 - x_2$ are separately small, as well as their difference, and this means that the point $P$ is almost straight behind the aperture, in the region where the Fresnel diffraction pattern occurs. In the second, $x_0$ is large, of the same order of magnitude as $R$, showing that we are examining the pattern at a considerable angle to the normal, as we do in the Fraunhofer case. The two approximate methods can now be simply described from the curve: for Fresnel diffraction, we approximate the hyperbola near its minimum by a parabola; for Fraunhofer diffraction, we approximate it farther out by a straight line.

Fig. 55. $r_1$ as function of $x_0 - x$: $r_1 = \sqrt{(x_0 - x)^2 + R^2}$. $r_1$ is the distance from a point of the aperture to a point on the screen; $x_0 - x$ is the difference between the $x$ coordinates of the points.

In the first case, assuming $R$ to be large compared with $(x_0 - x)$, we have by the binomial expansion

$$r_1 = R + \frac{1}{2}\frac{(x_0 - x)^2}{R} + \cdots,$$

or including the terms in $y$,

$$r_1 = R + \frac{1}{2}\frac{(x_0 - x)^2 + (y_0 - y)^2}{R} + \cdots. \tag{1}$$

In this case, in the notation of Eq. (8) of the previous chapter, we take $\bar{r} = R$, so that $r'$ is the remaining term of Eq. (1). For Fraunhofer diffraction, on the other hand, we have $x_0 \gg x$.
Then we write

$$r_1^2 = (x_0^2 + y_0^2 + R^2) - 2(x x_0 + y y_0) + x^2 + y^2,$$

and we can neglect the terms $x^2 + y^2$. If we let $R_0^2 = x_0^2 + y_0^2 + R^2$, where $R_0$ measures the distance from the center of the aperture to the point $P$, we can use a binomial expansion, obtaining

$$r_1 = R_0 - \frac{x x_0 + y y_0}{R_0} + \cdots. \qquad (2)$$

In this case we take $\bar{r} = R_0$, so that $r'$ is the remaining term of Eq. (2). Letting $x_0/R_0 = l$, $y_0/R_0 = m$, the direction cosines of the direction from the center of the aperture to $P$, we have $r' = -(lx + my) + \cdots$, involving the position on the screen only through the angles, so that we see at once that the pattern will travel outward radially from the aperture.

195. Fresnel Diffraction from a Slit. Let the aperture be a slit, extending from $x = -a/2$ to $x = a/2$, and from $y = -b/2$ to $y = b/2$. We assume $a$ to be small, $b$ comparatively large, as in Fig. 53, so that it is a long narrow slit. Using the results of Eq. (1), our integral is

$$\iint e^{-2\pi i r'/\lambda}\,dS = \iint e^{-\pi i[(x - x_0)^2 + (y - y_0)^2]/R\lambda}\,dS.$$

This can be immediately factored into

$$\int_{-b/2}^{b/2} e^{-\pi i (y - y_0)^2/R\lambda}\,dy \int_{-a/2}^{a/2} e^{-\pi i (x - x_0)^2/R\lambda}\,dx.$$

Since these two integrals are of the same form, we can treat just one of them. This will prove to give fringes parallel to one set of axes. The whole pattern is then simply the combination of the two sets of fringes. The single integral, for instance the one in $x$, has a real part, and an imaginary part (with sign changed), equal to

$$\int_{-a/2}^{a/2} \cos\frac{\pi (x - x_0)^2}{R\lambda}\,dx \quad\text{and}\quad \int_{-a/2}^{a/2} \sin\frac{\pi (x - x_0)^2}{R\lambda}\,dx. \qquad (3)$$

It is customary in these integrals to make a change of variables:

$$\frac{\pi (x - x_0)^2}{R\lambda} = \frac{\pi}{2}u^2.$$

Then the integrals become $\sqrt{R\lambda/2}$ times $C$ and $S$, respectively, where

$$C = \int_{u_1}^{u_2} \cos\frac{\pi}{2}u^2\,du, \qquad S = \int_{u_1}^{u_2} \sin\frac{\pi}{2}u^2\,du,$$

and where

$$u_1 = \frac{x_0 - a/2}{\sqrt{R\lambda/2}}, \qquad u_2 = \frac{x_0 + a/2}{\sqrt{R\lambda/2}}.$$

These integrals are called Fresnel's integrals.
They cannot be explicitly evaluated, but their values have been computed by series methods.

196. Cornu's Spiral. Let us plot the indefinite integral $\int_0^u \cos\frac{\pi}{2}u^2\,du$ as abscissa, $\int_0^u \sin\frac{\pi}{2}u^2\,du$ as ordinate, of a graph, as in Fig. 56. Then it is not hard to see that the resulting curve is a spiral, which is known as Cornu's spiral. To see this, we can first compute the slope. This is the differential of the ordinate, over the differential of the abscissa, or

$$\frac{\sin\frac{\pi}{2}u^2}{\cos\frac{\pi}{2}u^2} = \tan\frac{\pi}{2}u^2.$$

Thus, when $u^2$ increases by 4, the tangent of the curve swings around a complete cycle, and comes back to its initial value. Each point of the spiral corresponds to a particular value of $u$. We can show at once that the difference of $u$ between two points is simply the length of the curve between the points. We show this for an infinitesimal element of the curve. The square of the element of length, $ds^2$, is equal to the sum of the squares of the differentials of abscissa and ordinate, or is

$$\cos^2\!\left(\frac{\pi}{2}u^2\right)du^2 + \sin^2\!\left(\frac{\pi}{2}u^2\right)du^2.$$

Hence $ds = du$, and we can integrate to get $s = u_2 - u_1$. From this fact we can make sure of the spiral nature of the curve. For one turn of the curve corresponds to an increase of $u^2$ by 4. That is, if $u'$, $u''$ are the values at the two ends, $u''^2 = u'^2 + 4$. This is $u''^2 - u'^2 = 4$, $(u'' - u')(u'' + u') = 4$, $u'' - u' = 4/(u'' + u')$. The difference $u'' - u'$ is, however, simply the length of the turn, so that we see that, as we go farther along, the turns become smaller and smaller, so that they eventually become zero, which is characteristic of a spiral. It is plain that the spiral is symmetric in the origin, having two points, for $u = \pm\infty$, for which it winds up on itself.

Fig. 56. Cornu's spiral (abscissa $\int \cos\frac{\pi}{2}u^2\,du$, ordinate $\int \sin\frac{\pi}{2}u^2\,du$). The points of the spiral marked by cross bars correspond to increments of 0.1 unit in $u$.
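The properties of the spiral just derived can be checked numerically. The following is a minimal sketch, not the series method the text alludes to: it assumes a composite Simpson's rule is accurate enough for moderate $u$, and the names `fresnel`, `turn_length`, and `steps_per_unit` are illustrative choices rather than anything from the text.

```python
import math

def fresnel(u, steps_per_unit=2000):
    """Return (C, S): integrals from 0 to u of cos(pi t^2/2) and sin(pi t^2/2).

    Composite Simpson's rule; the step density is an illustrative assumption.
    """
    if u == 0.0:
        return 0.0, 0.0
    n = max(2, int(abs(u) * steps_per_unit))
    n += n % 2                      # Simpson's rule needs an even interval count
    h = u / n                       # negative for u < 0, so C and S are odd in u
    c = s = 0.0
    for i in range(n + 1):
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        phase = 0.5 * math.pi * (i * h) ** 2
        c += weight * math.cos(phase)
        s += weight * math.sin(phase)
    return c * h / 3.0, s * h / 3.0

def turn_length(u_start):
    """Arc length of the turn that begins at u_start (u^2 increases by 4)."""
    u_end = math.sqrt(u_start ** 2 + 4.0)
    return u_end - u_start          # same as 4 / (u_end + u_start)

if __name__ == "__main__":
    for u in (0.5, 1.0, 2.0, 5.0, 10.0):
        c, s = fresnel(u)
        print(f"u={u:5.1f}  C={c:.4f}  S={s:.4f}  turn length={turn_length(u):.4f}")
```

Run as a script, the printed points should drift toward the neighborhood of (1/2, 1/2) while the turn length shrinks, which is the winding-up behavior described above; because the integrands are even in $u$, negative $u$ gives the mirror-image branch.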
Let us take our spiral, mark on it the positions $u_1$ and $u_2$ corresponding to the limits of our integral, and draw the straight line connecting these points. The length of this line will then be proportional to the amplitude of the disturbance, and its square to the intensity. This is easy to see: the horizontal component of the line is just $C$, and the vertical component $S$, so that the square of its length is $C^2 + S^2$. Knowing this, we can easily discuss the fluctuations of intensity, as seen in Fig. 54. As $x_0$ changes, it is plain that $u_1$ and $u_2$ increase together, their difference remaining fixed and equal to $a/\sqrt{R\lambda/2}$. Thus essentially we have an arc of this length, sliding along the spiral, and the intensity is measured by the square of the chord between the ends of this arc.

Now when $x_0$ is large and negative, the arc is wound up on itself, so that its ends practically meet, and the intensity is zero. This is the situation in the shadow. As $x_0$ approaches the value $-a/2$, however, $u_2$ approaches zero, so that one end of the arc has reached the center of the figure. There are two quite different cases, depending on whether $u_2 - u_1$ is large or small. If it is large (a large slit and relatively short distance $R$ and small wave length), then $u_1$ will still not be unwound much at this point. The chord will then be half the value between the two end points of the spiral, and the intensity will be one-fourth its value without the screen, and will have increased uniformly in coming out of the shadow. As we go farther along the $x$ direction, however, the arc will begin to wind up on the other half of the spiral, producing alternations of intensity at the edge of the shadow. Then for a while $u_2$ will be nearly at one end of the spiral, $u_1$ at the other, so that the intensity for some distance will be nearly constant, and the same that we should have without the slit. This is the illuminated region directly behind the slit.
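The sliding-chord picture described so far can be sketched numerically. The sketch below is only an illustration under stated assumptions: it takes the limits as $u_1 = (x_0 - a/2)/\sqrt{R\lambda/2}$ and $u_2 = (x_0 + a/2)/\sqrt{R\lambda/2}$, which agrees with the statement that $u_2$ vanishes at $x_0 = -a/2$; it evaluates the Fresnel integrals by Simpson's rule rather than from tables; and it normalizes the intensity so that the full chord between the two end points of the spiral (squared length 2) counts as 1. The function names and the sample slit are hypothetical.

```python
import math

def fresnel(u, steps_per_unit=2000):
    """Return (C, S) from 0 to u by composite Simpson's rule (illustrative accuracy)."""
    if u == 0.0:
        return 0.0, 0.0
    n = max(2, int(abs(u) * steps_per_unit))
    n += n % 2                      # even number of intervals for Simpson's rule
    h = u / n
    c = s = 0.0
    for i in range(n + 1):
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        phase = 0.5 * math.pi * (i * h) ** 2
        c += weight * math.cos(phase)
        s += weight * math.sin(phase)
    return c * h / 3.0, s * h / 3.0

def slit_intensity(x0, a, R, wavelength):
    """Relative intensity at screen coordinate x0 behind a slit of width a.

    Squared chord between u1 and u2 on Cornu's spiral, divided by 2 so that
    the unobstructed wave gives intensity 1.
    """
    scale = math.sqrt(R * wavelength / 2.0)
    u1 = (x0 - a / 2.0) / scale
    u2 = (x0 + a / 2.0) / scale
    c1, s1 = fresnel(u1)
    c2, s2 = fresnel(u2)
    return ((c2 - c1) ** 2 + (s2 - s1) ** 2) / 2.0

if __name__ == "__main__":
    # Hypothetical wide slit: a = 20 in units where sqrt(R*lambda/2) = 1.
    a, R, wavelength = 20.0, 2.0, 1.0
    for x0 in (-30.0, -12.0, -10.0, -9.0, -8.0, 0.0):
        print(f"x0={x0:6.1f}  I={slit_intensity(x0, a, R, wavelength):.4f}")
```

For such a wide slit the value at the geometrical edge, $x_0 = -a/2$, comes out close to one-fourth, the value deep in the shadow is nearly zero, and the values just inside the edge oscillate about unity.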
Finally we approach the other boundary, and $u_1$ commences to unwind. We then go through the same process in the opposite order.

The other quite different case comes when $u_2 - u_1$ is small, which is the case for small slit, or large wave length or distance. Then there is never a time when $u_1$ is on one branch of the spiral and $u_2$ on the other. All through the central part of the pattern, therefore, there are no fluctuations of intensity. Such fluctuations come only far to one side or the other. They come about in this way: At some places in the pattern, the arc is long enough to wind up for a whole number of turns, and the chord is practically zero, while at other places it winds up for a whole number plus a half, and the chord has a maximum. The resulting fringes are the Fraunhofer fringes which we shall now discuss by a different method.

197. Fraunhofer Diffraction from Rectangular Slit. Using the approximation (2), our integral for Fraunhofer diffraction is

$$e^{-2\pi i R_0/\lambda} \iint e^{2\pi i (lx + my)/\lambda}\,dS.$$

The first term, as in Fresnel diffraction, contributes nothing to the relative intensities, and may be neglected. We then have $\iint e^{2\pi i (lx + my)/\lambda}\,dS$ as the integral whose absolute value measures the amplitude of the disturbance. Let us suppose that the aperture is the same sort of rectangle considered above, extending from $-a/2$ to $a/2$ along $x$, from $-b/2$ to $b/2$ along $y$. Then the integral is

$$\int_{-a/2}^{a/2} e^{2\pi i l x/\lambda}\,dx \int_{-b/2}^{b/2} e^{2\pi i m y/\lambda}\,dy = \left(\frac{e^{\pi i l a/\lambda} - e^{-\pi i l a/\lambda}}{2\pi i l/\lambda}\right)\left(\frac{e^{\pi i m b/\lambda} - e^{-\pi i m b/\lambda}}{2\pi i m/\lambda}\right) = \frac{\sin(\pi l a/\lambda)}{\pi l/\lambda}\,\frac{\sin(\pi m b/\lambda)}{\pi m/\lambda}. \qquad (4)$$

The intensity is the square of this quantity. Let us consider its dependence on the position of the point $P$ on the screen.
The coordinates of this point enter only in the expressions $l$, $m$, showing that the pattern increases in size proportionally to the distance, as if it consisted of rays traveling out in straight lines from the small aperture, rather than having an approximately constant size as with the Fresnel diffraction (see Fig. 54). When we consider the detailed behavior of the intensity as a function of the angle, we find that this can be written as $a^2\,\dfrac{\sin^2(\pi l a/\lambda)}{(\pi l a/\lambda)^2}$ times a similar function of $m$, giving a curve of the form $\dfrac{\sin^2\alpha}{\alpha^2}$, where $\alpha = \pi l a/\lambda$. This function becomes unity when $\alpha = 0$, goes to zero for $\alpha = \pi, 2\pi, 3\pi, \cdots$, with maxima of intensity approximately midway between. The maxima decrease rapidly in intensity. Thus at the points $3\pi/2$, $5\pi/2$, $\cdots$, which are approximately at the second and third maxima, the intensities are only $(2/3\pi)^2$, $(2/5\pi)^2$, $\cdots$, or 0.045, 0.016, $\cdots$, compared with the central maximum of 1. Let us see how the size of the fringes depends on the dimensions of the slit. The minima come for $\alpha = n\pi$, or $la/\lambda = n$, $l = n\lambda/a$. Thus we see that the greater the wave length, or the smaller the dimensions of the slit, the larger the pattern becomes.

The positions of the minima can be immediately found by a very elementary argument. Assume for convenience that we are investigating the pattern at a point in the $xz$ plane, so that $m = 0$. Then draw a plane normal to the direction $l$, passing through one edge of the aperture, as in Fig. 57. This represents a wave front of the diffracted wave, just as it passes one edge of the aperture. From the geometry of the system, this wave front is a distance $la$ from the other edge, or $la/2$ from the middle of the aperture. Now, if the distance of the middle is just a whole number of half wave lengths different from the distance from the edge, the contributions of these two points to the amplitude will just cancel, being just out of phase. The other points of one half of the aperture can all be paired against corresponding points of the other half whose contributions are just out of phase, finally resulting in zero intensity. (Strictly, this pairing applies as stated for one half wave length; for $n$ half wave lengths the aperture is divided into $n$ equal strips and the same pairing is made within each strip.) This situation comes about when $la/2 = n\lambda/2$, where $n$ is an integer, or $l = n\lambda/a$, the same condition found above. Since most of the intensity falls within the first minimum, and since $l$ is the sine of the angle between the ray and the normal to the surface, we may say that by Fraunhofer diffraction the ray is spread out through an angle $\lambda/a$.

Fig. 57. Elementary construction for Fraunhofer diffraction.

198. The Circular Aperture. The problem of Fraunhofer diffraction through a circular aperture is slightly more complicated mathematically. Here we must evaluate $\iint e^{2\pi i (lx + my)/\lambda}\,dS$ over a circle. Let us introduce polar coordinates in the plane of the aperture, so that $x = \rho\cos\theta$, $y = \rho\sin\theta$. Further, on account of symmetry, we may take the point $P$ to be in the $xz$ plane, so that $m = 0$. Then if $\rho_0$ is the radius of the aperture, the final result is

$$\int_0^{2\pi} d\theta \int_0^{\rho_0} e^{2\pi i \rho\cos\theta\, l/\lambda}\,\rho\,d\rho.$$

We can integrate with respect to $\rho$ by parts, obtaining for the integral

$$\int_0^{2\pi} d\theta \left[\frac{\rho_0\, e^{2\pi i \rho_0\cos\theta\, l/\lambda}}{2\pi i\cos\theta\, l/\lambda} - \frac{e^{2\pi i \rho_0\cos\theta\, l/\lambda} - 1}{(2\pi i\cos\theta\, l/\lambda)^2}\right].$$

For the integration with respect to $\theta$, it is necessary to expand the exponentials in series. If we do this, the integrals are in each case integrals of a power of $\cos\theta$, from 0 to $2\pi$. These are easily evaluated, and the result, combining terms, proves to be

$$\pi\rho_0^2\left\{1 - \frac{1}{2}k^2 + \frac{1}{3}\left(\frac{k^2}{2!}\right)^2 - \frac{1}{4}\left(\frac{k^3}{3!}\right)^2 + \cdots\right\},$$

where $k$ is an abbreviation for $\pi\rho_0 l/\lambda$. If we recall the formulas for Bessel's functions, we can see without difficulty that this is equal to $\dfrac{\rho_0\lambda}{l}\,J_1\!\left(\dfrac{2\pi\rho_0 l}{\lambda}\right)$. It is not hard, using some of the properties of Bessel's functions, to prove this formula directly, without the use of series.
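The identity between the bracketed series and the Bessel-function form can be checked numerically, and the same helper can locate the zeros of $J_1$, at which the amplitude vanishes. The following is a minimal sketch: it assumes the standard power series for $J_1$ and a plain bisection search, and the bracketing intervals (3, 4.5) and (6.5, 7.5) are assumptions chosen by inspection of the sign of $J_1$, not values taken from the text.

```python
import math

def bessel_j1(z, terms=40):
    """Power series for J1(z); adequate for the moderate z used here."""
    return sum((-1) ** m * (z / 2.0) ** (2 * m + 1)
               / (math.factorial(m) * math.factorial(m + 1))
               for m in range(terms))

def bracket_series(k, terms=40):
    """1 - (1/2)k^2 + (1/3)(k^2/2!)^2 - (1/4)(k^3/3!)^2 + ..."""
    return sum((-1) ** m * (k ** m / math.factorial(m)) ** 2 / (m + 1)
               for m in range(terms))

def find_zero(f, lo, hi, tol=1e-12):
    """Bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    k = 0.7
    # pi*rho0^2 * bracket_series(k) should equal (rho0*lambda/l) * J1(2k),
    # i.e. bracket_series(k) == J1(2k) / k.
    print(bracket_series(k), bessel_j1(2.0 * k) / k)
    for lo, hi in ((3.0, 4.5), (6.5, 7.5)):
        z = find_zero(bessel_j1, lo, hi)
        print(f"zero of J1 at z = {z:.5f} = {z / math.pi:.4f} pi,"
              f"  rho0*l/lambda = {z / (2.0 * math.pi):.4f}")
```

The truncated power series is used only because it mirrors the expansion in the text; for large arguments a library routine or an asymptotic form would be the safer choice.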
From the series, we see that the intensity has a maximum for $l = 0$, the center of the pattern. As $l$ increases, we can see the behavior most easily from the expression in terms of Bessel's functions. Since $J_1$ has an infinite number of zeros, there are an infinite number of light and dark fringes. The first dark band comes at the first zero of $J_1$, which from tables is at $2\pi\rho_0 l/\lambda = 1.2197\pi$, $\rho_0 l/\lambda = 0.61$. The next is at $\rho_0 l/\lambda = 1.116$, and so on, with maxima between. We see that, except for a numerical factor, the pattern from a circular aperture has about the same dimensions as that from a square aperture. Thus if the side of the square were equal to the diameter of the circle, $2\rho_0$, the first dark fringe would be at $2\rho_0 l/\lambda = 1$, $\rho_0 l/\lambda = 0.5$, and the next one at 1.0.

199. Resolving Power of a Lens. Whenever light passes through a lens, it is not only refracted, but it has passed through a circular aperture, the size of the lens itself or of the diaphragm which stops it down, and as a result it is diffracted. Suppose, for example, that the lens is the objective of a telescope, and that parallel light falls on it, as from an infinitely small or distant star. Then after passing through the diaphragm, the light will no longer be a plane wave, but will have intensity in different directions, as shown in the last section. The central maximum will have an angular radius of $0.61\lambda/\rho_0$, where $\rho_0$ is now the radius of the telescope objective. The resulting waves are just as if the light came from an object of about this angular size, but passed through no diaphragm. When the telescope focuses the radiation, the result will be not a single point of light, but a circular spot surrounded by fringes, as of a star of finite diameter. For this reason, the telescope is not a perfect instrument, and one would say that its resolving power was only enough to resolve the angle $0.61\lambda/\rho_0$.
This is usually taken to mean the following: if two stars had an actual angular separation of this amount, the center of the image of one star would lie on the first dark fringe of the other, and the patterns would run into each other so that they could be just resolved. We see that the larger the aperture of the telescope, or the smaller the wave length, the better is the resolution. The same general situation holds for microscope lenses.

200. Diffraction from Several Slits; the Diffraction Grating. Suppose we have a number $N$ of equal, parallel slits, equally spaced. Let each have the width $a$ along the $x$ axis, and let the spacing on centers be $d$, so that the centers come at $x = 0, d, \cdots, (N - 1)d$. Now let us find the Fraunhofer pattern. The part of the integral depending on $y$ will be just as with the single slit, and we leave it out of account. We are left with

$$\int_{-a/2}^{a/2} e^{2\pi i l x/\lambda}\,dx + \int_{d - a/2}^{d + a/2} e^{2\pi i l x/\lambda}\,dx + \cdots + \int_{(N-1)d - a/2}^{(N-1)d + a/2} e^{2\pi i l x/\lambda}\,dx.$$

But this is, as we can immediately see, simply

$$\int_{-a/2}^{a/2} e^{2\pi i l x/\lambda}\,dx\,\left(1 + e^{2\pi i l d/\lambda} + e^{2\pi i l\,2d/\lambda} + \cdots + e^{2\pi i l (N-1)d/\lambda}\right).$$

By the formula for the sum of a geometric series, this is

$$\int_{-a/2}^{a/2} e^{2\pi i l x/\lambda}\,dx\;\frac{1 - e^{2\pi i l N d/\lambda}}{1 - e^{2\pi i l d/\lambda}}.$$

Let the first term be $A$, the amplitude due to a single slit, which we have already evaluated. Now to find the intensity we multiply this by its conjugate, which gives

$$A^2\,\frac{1 - \cos(2\pi l N d/\lambda)}{1 - \cos(2\pi l d/\lambda)} = A^2\,\frac{\sin^2(\pi l N d/\lambda)}{\sin^2(\pi l d/\lambda)}. \qquad (5)$$

That is, with $N$ slits the actual intensity is that with one slit, but multiplied by a certain factor. This factor goes through zero when $lNd/\lambda$ is an integer, so that $l$ equals an integer multiplied by $\lambda/Nd$.
This gives fringes with a narrow spacing, characteristic of the whole distance $Nd$ occupied by the set of apertures, crossing the other pattern, and they are what are usually called interference fringes, since they are due, not to diffraction from a single aperture, but to interference between different apertures. But in addition to this, the denominator results in having these fringes of different heights. The minimum height occurs when the denominator equals unity, when the fringes are of height $A^2$, and the most intense fringes come when the denominator is zero. Here the ratio of numerator to denominator is evidently finite, and gives fringes of height $N^2A^2$. Thus the greater $N$ is, the greater the disparity in height between the largest and smallest maximum. Evidently every $N$th maximum will be high, and the high ones will be spaced according to the law $ld/\lambda = k$, an integer.

Now suppose $N$ becomes very great, as in a diffraction grating. Then the small maxima will become so weak compared with the strong ones that only the latter need be considered. The latter will seem to consist of a set of sharp lines, with darkness between. These sharp lines come, as we have seen, at angles to the normal given by $k\lambda = d\sin\theta$, where $k$ is an integer, and $\sin\theta = l$. This is the ordinary diffraction grating formula, where $k$ is 0 for the central image, 1 for the first-order spectrum, 2 for the second order, etc. But we cannot entirely neglect the fact that there are other small maxima near the important ones. Thus for $ld/\lambda = k$, the intensity is $N^2A^2$. This comes for $lNd/\lambda = Nk$. But for $lNd/\lambda = Nk + \tfrac{3}{2}$, we again have a secondary maximum, whose height is now

$$\frac{A^2}{\sin^2(\pi l d/\lambda)} = \frac{A^2}{\sin^2\!\left(\pi k + \dfrac{3\pi}{2N}\right)}.$$

Now $\sin^2\!\left(\pi k + \dfrac{3\pi}{2N}\right) = \left(\dfrac{3\pi}{2N}\right)^2$ approximately, if $N$ is large, so that the height of the maximum is $4N^2A^2/9\pi^2$, or about 0.045 of the height of the highest maximum.
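The estimate just made can be checked against the exact interference factor. The sketch below evaluates $\sin^2(\pi l N d/\lambda)/\sin^2(\pi l d/\lambda)$ directly; the function names, the tolerance used to detect a principal maximum, and the sample values ($N = 1000$, $d/\lambda = 10$) are illustrative assumptions only.

```python
import math

def grating_factor(l, n_slits, d_over_lambda):
    """sin^2(pi l N d/lambda) / sin^2(pi l d/lambda), the factor multiplying A^2."""
    phi = math.pi * l * d_over_lambda
    denom = math.sin(phi) ** 2
    if denom < 1e-24:               # principal maximum: the ratio tends to N^2
        return float(n_slits ** 2)
    return math.sin(n_slits * phi) ** 2 / denom

def secondary_to_principal(n_slits, order=0, d_over_lambda=10.0):
    """Height of the maximum at l N d/lambda = N k + 3/2, relative to N^2."""
    l = (n_slits * order + 1.5) / (n_slits * d_over_lambda)
    return grating_factor(l, n_slits, d_over_lambda) / n_slits ** 2

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(f"N={n:5d}  ratio={secondary_to_principal(n):.5f}"
              f"  estimate 4/(9 pi^2)={4.0 / (9.0 * math.pi ** 2):.5f}")
```

For large $N$ the computed ratio should settle toward $4/9\pi^2$ independently of $N$; for small $N$ it departs from that value because the small-angle approximation to the denominator is then poorer.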
Thus the first few secondary maxima cannot be neglected. To get an idea of the width of the region through which the intensity is considerable, we may take the width of the first maximum. From the center to the first dark fringe, this is given by the fact that at the center $lNd/\lambda = Nk$, at the dark fringe $= Nk + 1$, so that $\Delta l = \lambda/Nd$. This is closely connected with the resolving power of a grating. For a single frequency gives not a sharp set of lines, one for each order, but a set broadened by the amount we have found. Thus two neighboring wave lengths, differing by $\Delta\lambda$, could only just be resolved if the first minimum of one lay opposite the maximum of the other. Since $l = \lambda k/d$, this would be the case if $\Delta l = \Delta\lambda\,k/d = \lambda/Nd$, or if $\Delta\lambda/\lambda = 1/Nk$. The resolving power thus increases as the number of lines in the grating increases, and as the order of the spectrum increases.

Problems

1. Carry through a discussion of Fresnel diffraction from a slit, when the source is at a finite distance, directly behind the center of the slit. In what ways will the result differ from the case we have discussed?

2. Light of wave length 6,000 A. falls in a parallel beam on a slit 0.1 mm. broad. Work out numerical values for the intensity distribution across the slit, at three distances: first, in which the Fresnel fringes are small compared with the size of the pattern; second, in which they are of the same order of magnitude; and third, in which they are Fraunhofer fringes. Either construct Cornu's spiral yourself, from tables of Fresnel's integrals, or use the one of Fig. 56.

3. Find the coordinates of the points at which Cornu's spiral winds up on itself. From the chord between these points, compute the intensity behind an infinitely broad slit, which essentially means no slit at all. Find whether this agrees with what you should expect it to be.

4. Prove that the maxima of the function

$$\frac{\sin^2(\pi l a/\lambda)}{(\pi l a/\lambda)^2} = \frac{\sin^2\alpha}{\alpha^2}$$

are determined by the equation $\alpha = \tan\alpha$. Find the first three solutions of this transcendental equation and compare them with the approximate solutions $\alpha = 3\pi/2,\ 5\pi/2,\ 7\pi/2$.

5. Discuss the Fresnel diffraction pattern caused by an edge coincident with the $y$ axis, the screen occupying one-half the $xy$ plane. The diffraction pattern is obtained in a plane parallel to the $xy$ plane and a distance $R$ from it. Plot the variation of intensity of light along the $x$ direction from a region inside the shadow to well into the directly illuminated area. Prove that the intensity of light just at the edge of the geometrical shadow is one-fourth of its value if there were no diffraction edge.

6. Evaluate the Fresnel integrals $\int_0^u \cos\frac{\pi}{2}u^2\,du$ and $\int_0^u \sin\frac{\pi}{2}u^2\,du$ in a power series. What is the range of convergence of these series?

7. Evaluate the Fresnel integrals in series of the form $\cos\frac{\pi}{2}u^2\;S_1 + \sin\frac{\pi}{2}u^2\;S_2$, where $S_1$ and $S_2$ are power series in $u$. What is the range of convergence of these series?

8. Find a semiconvergent series for the Fresnel integrals of the same form as in Prob. 7 where the power series are now in inverse powers of $u$. (Hint: Write $\int^{\infty} \cos x^2\,dx = \int^{\infty} x\cos x^2\,\dfrac{dx}{x}$ and integrate by parts, repeating the process.) Calculate the remainder in these series after the $n$th term. Show that this is smallest when $n$ is about $x^2/2$.

CHAPTER XXVIII

WAVES, RAYS, AND WAVE MECHANICS

The beautiful success of the wave theory in explaining diffraction patterns, which we have been discussing in the last chapter, has been the best proof of the correctness of this theory. But the proof has not always gone unchallenged. Ever since the time of Newton, at least, there has been a rival theory, the corpuscular theory. Newton imagined light to consist of a stream of particles. These particles, or corpuscles, traveled in straight lines in empty space, and were reflected by mirrors as billiard balls would be by walls, making equal angles of incidence and reflection.
Refraction was explained by supposing that different media had different attractions for the corpuscles. Thus glass would attract them more than air, the potential energy of a corpuscle being constant within any one medium, but being lower in glass than in air, so that the corpuscles would have a normal component of acceleration toward the glass, without corresponding tangential acceleration, and would be bent toward the normal on entering the glass. By working out this idea, the law of refraction easily follows. Newton was aware of the wave theory; Huygens was advocating it at the time. But his objection was that light travels in straight lines, whereas the waves he was familiar with, waves of sound or water waves, certainly are bent out in all directions on passing through apertures. Newton considered this to be a fatal objection to the wave theory.

The answer to this objection, of course, came later with the quantitative investigation of diffraction. In the preceding chapter, we have seen that a plane parallel wave, falling on a small aperture of dimension $a$, does not form a perfectly parallel ray after emerging from the hole. On the contrary, it spreads out, first by forming fringes on the edges of the ray (Fresnel diffraction), then at greater distance by developing a conical form with definitely diverging rays (Fraunhofer diffraction). The angle of this cone is of the order of magnitude of $\lambda/a$, where $\lambda$ is the wave length. Newton was tacitly assuming that the wave length, as with sound, was large, that $\lambda/a$ would be large for a small slit, and there would be large spreading out and a completely undefined ray. But it was found early in the nineteenth century that the wave length was really so small that, with apertures of ordinary size, we can neglect diffraction, and obtain an almost perfectly sharp ray, a band of light separated from the darkness by sharp, straight edges.

201. The Quantum Hypothesis. More recently, in the present century, a more serious argument for a corpuscular theory has appeared. This is the hypothesis of quanta, originated by Planck in discussing the radiation from a heated black body. The most graphic application of this hypothesis was made by Einstein to the theory of the photoelectric effect. It is known that light of frequency $\nu$, falling on a metal surface, liberates electrons, as for example in the photoelectric cell. Now the law of emission is remarkable: the energy of each emitted electron, independent of the intensity of the light, is a definite amount proportional to the frequency, $h\nu$, where $h$ is Planck's constant, equal to $6.54 \times 10^{-27}$ in c.g.s. units, introduced by him in his first discussion. This energy of the emitted electron is really decreased by the amount of energy it loses in penetrating the surface, so that $h\nu$ will act as a maximum energy, rather than the energy of each electron. Of course, the total emission is proportional to the intensity of the light, but increasing the intensity increases the number of electrons, not their energies.

Einstein's hypothesis to explain the photoelectric effect was that the energy of the wave was not to be computed in a continuous manner by Poynting's vector, but that it was localized in little particles or corpuscles (now called photons), each of energy $h\nu$. Then it would be perfectly obvious that if no photon fell on a spot of the metal, no electron would be ejected; but that a photon which happened to fall on a given place would transfer all its energy to an electron, being absorbed, and ceasing to exist as light. The intensity of light would be measured simply by the number of photons crossing an arbitrary surface per second, times the energy carried by each photon.

Einstein's hypothesis found many supports. One of these comes from the structure of atoms. Atoms emit monochromatic spectrum lines, falling often into regular series.
Bohr was able to explain this, at least in hydrogen, the simplest atom, by assuming that the atom was capable of existing only in certain definite stationary states, each of a definite energy. He supposed that radiation was not emitted continuously, as the electromagnetic field from a rotating or vibrating particle would be, but that the atom stayed in one energy level until it suddenly made a jump to a second, lower, level, with emission of a photon. If the higher energy is $E_2$, the lower $E_1$, the energy of the photon would be $E_2 - E_1$, so that its frequency would be $E_2/h - E_1/h$. This formula has proved to be justified by great amounts of experimental material. First, it states that the frequencies emitted by atoms should be the differences of "terms" $E/h$, each referring to an energy level of the atom. This is found to be true in spectroscopy, and has been the most fruitful idea in the development of that science. Even tremendously complicated spectra can now be analyzed to give a set of terms, and the number of terms is much less than the number of lines, since any pair of terms, subject to certain restrictions, gives a line. But also, Bohr was able to set up a system of mechanics to govern the hydrogen atom, very simple in its fundamentals, though different from classical mechanics, which gives a very simple formula for the energy levels, agreeing perfectly with the extremely accurate experimental values.

Bohr's idea of stationary states, in turn, was tested by experiments on electron bombardment. It was found that an atom in state of energy $E_1$ could be bombarded by an electron. If the electron's energy, as determined from the electrical difference of potential through which it had fallen, was less than $E_2 - E_1$, where $E_2$ is the energy of the upper state (we consider only one), it would bounce off elastically, without loss of energy.
But if its energy was $E_2 - E_1$, or greater, it would often raise the atom to the upper state, which could be proved by subsequent radiation by the atom, and would lose this amount of energy itself. This definitely verified the existence of sharp energy levels in the atom. At the same time, it furnishes an example of a very interesting phenomenon. An electron bombards an atom, loses energy $E_2 - E_1$. This energy is emitted as a photon $h\nu$. The photon falls on a metal, is absorbed, ejects a photoelectron of energy $E_2 - E_1$ (minus a little, for the work of coming through the surface). The photoelectron bombards an atom, loses its energy, which goes off as a photon. Energy, in other words, passes back and forth from electrons to photons indiscriminately. If electrons are particles, surely photons are too.

202. The Statistical Interpretation of Wave Theory. All these phenomena suggesting photons, and a corpuscular structure for light, must not cause one to forget that light still shows interference, and that the arguments for the wave theory are as strong as ever. Various attempts were made to set up laws of motion for the photons, which would lead to the correct laws of interference and diffraction (Newton had already done it for refraction), but without success. We can see easily why this should be so. Consider very weak light, so weak that we only have a photon every minute, for example, going through a diffraction grating. Such weak light, we know experimentally, is diffracted just like stronger light. But that means, as we saw in the last chapter, that the resolving power depends on a cooperation of the whole grating; if half of it were shut off, its resolving power would be decreased, and the intensity distribution changed.
Even the single photon shows evidence of the full resolving power, in that if we make a large enough exposure to have many photons, so that we can develop the photograph and measure the blackening, which surely measures the number of photons which have struck the plate, we find the full resolving power of the grating in the final photograph. But it is difficult to imagine any law of motion of a photon which will depend on rulings over the whole face of a grating, if the photons went through only one point of it.

After such difficulties, the theory that has emerged is a combination of wave theory and corpuscular theory. It is assumed that atoms emit wave fields as in the electromagnetic theory, emitted by certain oscillators connected with the atom, and vibrating with the emitted frequencies. These waves do not carry energy, but serve merely to determine the probable motion of the photons. The rate of emission of waves by the oscillator determines the probability of emission of photons. Poynting's vector at any point of the radiation field determines the probability that a photon will cross unit cross section normal to the radiation, per second. If the oscillator is damped with time, that indicates that the probability of emission of a photon decreases with time; that is, that the probability that the atom is in its upper, excited state, from which it could emit the radiation, is decreasing with time. One can carry such a probability connection through in detail.

Probably the most graphic picture of the probability relation between photons and waves is obtained if we imagine very weak light, in which photons come along one in several seconds, forming a diffraction pattern. The diffraction pattern is assumed to be on a screen which is capable of registering the individual photons as they come along.
This screen might be a photographic plate, in which a single photon is enough to make a grain developable, or it might be a screen having slits opening into Geiger counters or other devices for registering individual photons. Of course, the only way of detecting that there was light falling on the screen would be to detect the photons. First, one photon would strike the screen, in one spot, then another photon in another spot, and so on. So long as there were only a few photons, the arrangement might seem to be haphazard. But as more and more photons were present, we could find where they were densely distributed, and where there were only a few. It would then prove to be the case that the places where photons were dense were just those places where the wave theory predicted a large intensity, and the places where there were no photons were those where the wave theory indicated darkness.

203. The Uncertainty Principle for Optics. It is characteristic of the theory that no law of motion of photons is assumed beyond this probability; according to the present view, no such detailed laws exist. Given a plane monochromatic wave of light, we know exactly the energy of each photon ($h\nu$), and its momentum (this proves to be $h\nu/c = h/\lambda$, pointing in the direction of the wave normal), but, if the intensity is uniform over space, we have no information as to the position of the photon. If we let the plane wave fall on a slit of width $a$, the light passing through will be more defined as to its position in space. It will be in the form of a small ray or beam, spreading by diffraction, but still, in the region of Fresnel diffraction, of width approximately $a$. Thus, if $x$ is the coordinate along the wave normal, $y$ the coordinate at right angles, the photon will surely be in a beam whose length along the $x$ axis is infinite, but of width only about $a$ along the $y$ axis, as in Fig. 58.
That is, the uncertainty in the $y$ coordinate has been reduced to $a$: $\Delta y = a$, if $\Delta y$ is the uncertainty. At the same time, however, a compensating uncertainty in the momentum has appeared. The wave is now spreading, the wave normals making angles up to about $\lambda/a$ with the $x$ axis, as shown in Sec. 197. Thus, if the whole momentum remains $p = h/\lambda$, this will have a component along $y$, equal to $p$ times the sine of the angle between the momentum and the $x$ axis, or approximately $p\lambda/a = h/a$. But we do not know which angle, up to the maximum, the actual deviation will make, for all we know is that the photon is somewhere in the diffraction pattern. Hence the uncertainty in $y$ momentum is of this order of magnitude of $h/a$. If we call it $\Delta p_y$, we have the relation

$$\Delta y\,\Delta p_y = a\cdot\frac{h}{a} = h. \tag{1}$$

This is an example of the uncertainty principle, concerning the amount of uncertainty inherent in the description of the motion of photons by the probability relations with wave theory. Further examination indicates that this law is very general: where a beam is limited to acquire more accurate information about the coordinates of the photon, we make a corresponding loss in our knowledge as to its momentum, and vice versa.

Fig. 58. Uncertainty principle in diffraction through slit. (Compare Fig. 54, top diagram.) $\Delta p/p = \lambda/\Delta q$, $\Delta p\,\Delta q = p\lambda = h$.

A similar relation holds between energy and time. Suppose we have a shutter over our hole, and open it and close it very rapidly, so as to allow light to pass through for only a very short interval of time $\Delta t$. Then the wave on the far side is an interrupted sinusoidal train of waves, and we know by our Fourier analysis, as in Sec. 185, that the frequency is no longer a definitely determined value, but is spread out through a frequency band of breadth $\Delta\nu$, given by $\Delta\nu/\nu = 1/(\text{number of waves in train})$.
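The statement that a train of $N$ waves occupies a frequency band of relative breadth about $1/N$ can be checked by direct numerical integration. The following is a minimal sketch and not part of the original text: the function name `train_amplitude`, the unit carrier frequency, and the midpoint-rule integration are assumptions made for illustration. It evaluates the Fourier amplitude of a cosine train lasting exactly $N$ periods, once at the carrier frequency and once at a frequency displaced from it by the fraction $1/N$.

```python
import cmath
import math


def train_amplitude(nu, nu0, n_waves, steps=20000):
    """Magnitude of the Fourier amplitude at frequency nu of the train
    cos(2*pi*nu0*t) that lasts for n_waves full periods and is zero otherwise."""
    duration = n_waves / nu0          # length of the train in time
    dt = duration / steps
    total = 0j
    for k in range(steps):
        t = (k + 0.5) * dt            # midpoint rule
        total += math.cos(2 * math.pi * nu0 * t) * cmath.exp(-2j * math.pi * nu * t)
    return abs(total * dt)


if __name__ == "__main__":
    nu0 = 1.0                         # carrier frequency, arbitrary units (assumption)
    for n_waves in (5, 20, 80):
        at_carrier = train_amplitude(nu0, nu0, n_waves)
        at_edge = train_amplitude(nu0 * (1 + 1 / n_waves), nu0, n_waves)
        print(n_waves, at_carrier, at_edge / at_carrier)
```

In this sketch the amplitude at the displaced frequency should fall to zero, apart from rounding error, so that the band between the first zeros on either side of the carrier has half-width $\nu/N$, which agrees with the order-of-magnitude relation used in the text.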
Now the number of waves in the train is $c\,\Delta t$, the length of the train, divided by $\lambda$. Hence $\Delta\nu/\nu = \lambda/(c\,\Delta t)$, $\Delta\nu\,\Delta t = 1$. Using $E = h\nu$, we have

$$\Delta E\,\Delta t = h, \tag{2}$$

an uncertainty relation between $E$ and $t$, showing that energy and time are roughly equivalent to momentum and coordinate: if we try to measure exactly when the photons go through the hole, their energy becomes slightly indeterminate. Further, here we know that the $x$ coordinate is now determined, at any instant of time, with an accuracy $c\,\Delta t$: the photon must be in the little puff of light, or wave packet, sent through the pinhole while the shutter was open. Thus $\Delta x = c\,\Delta t$. But now the $x$ component of momentum, which to the first order is the momentum itself, is uncertain. For $p_x = p = h\nu/c$, $\Delta p_x = (h/c)\,\Delta\nu = h/(c\,\Delta t) = h/\Delta x$, so that

$$\Delta x\,\Delta p_x = h, \tag{3}$$

again the uncertainty relation. We can, in other words, make our wave packet smaller and smaller, until it seems almost like a particle itself, and its path is the path of the photon. The wave packet will be reflected and refracted, just as large waves would be, giving the laws of motion of photons in refracting media. But if we try to go too far, making the wave packet too small, we defeat our purpose, and make it spread out by diffraction. We cannot, that is, get exactly accurate knowledge about the laws of the photon's motion from the probability relation. In some cases, this is even more obvious than here. Thus, if a wave packet is sent through a diffraction grating, it will spread out much as a plane wave would, into the various orders of the diffraction pattern. We cannot, then, make any prediction at all, except a statistical one, as to which order of the pattern a given photon will go to. We completely lose track of the paths of individual photons in a diffraction pattern.

204. Wave Mechanics.
It is now a remarkable fact that many indications point out that there is the same dualism between waves and particles in mechanics that there is in optics. We have seen one in the way energy passes from electrons to photons, and back again. We can paraphrase our earlier remark by saying that surely if photons are connected with waves, electrons are connected with waves too. But there are more substantial reasons. In discussing the statistical relation of waves and photons, we mentioned that the electromagnetic waves were produced by oscillators, and it appears that these oscillators have only a statistical relation to the atoms. Thus we noted that the oscillators connected with radiating atoms would be exponentially damped, while the atoms were discontinuously jumping from an excited state to a lower state from which they did not radiate. This suggests a statistical connection between the oscillators and the atoms or electrons, the number of atoms in the excited state at any instant being related to the instantaneous amplitude of the corresponding oscillators, as the number of photons is related to the amplitude of the electromagnetic wave.

But there are two compelling reasons which have led to the acceptance of the connection between the motion of particles and waves. The first was the experimental proof, by Davisson and Germer, G. P. Thomson, and others, that electrons can show the same sort of diffraction effects that light shows, being diffracted by crystals, and even by ruled gratings. The second was the fact, discussed by de Broglie and developed by Schrödinger, that the stationary states of atoms and molecules correspond to the various overtones of a standing wave system. Thus
the waves associated with particles not only can have progressive form, connected with particles traveling along, but can also exist as standing waves, and these are precisely the oscillators which are statistically connected with the atoms, and which represent the stationary states of Bohr's theory. We shall elaborate the theory of these stationary states in succeeding chapters.

It is definitely settled, then, that mechanics is just as much a wave phenomenon as optics is. The wave mechanics leads to Newtonian mechanics as a limiting case, just as the wave theory of light leads to geometrical optics, where one treats rays only, and where one can assume that the light consists of particles following fixed paths and moving according to fixed laws. Our work, so far in this book, has been divided roughly into two sections, mechanics, and the electromagnetic theory and optics. We now commence a third section, of equivalent importance, on wave mechanics. But as the standing waves of wave mechanics are often the atoms themselves, it is natural that our treatment should be intimately bound up with the structure of matter, a subject which one can mostly leave out in speaking of mechanics or optics, but which is of the very essence of the problem with wave mechanics.

205. Frequency and Wave Length in Wave Mechanics. If we are considering a mechanical particle of energy $E$, momentum $p$ in a given direction, we assume that associated with it is a wave (of course, not a light wave or a vibrational wave of a material medium; we are now accustomed in physics to the idea of purely mathematical waves, without reference to any medium) whose frequency $\nu$ and wave length $\lambda$ are given by the equations

$$E = h\nu, \qquad p = \frac{h}{\lambda}, \tag{4}$$

the wave normal being in the direction of motion of the particle. The reason why one ordinarily is not conscious of the wave nature of mechanics is the extraordinarily small wave length involved.
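How small the wave length is can be seen by evaluating $\lambda = h/p = h/mv$ from Eq. (4) directly. The following is a minimal sketch and not part of the original text: the function name `de_broglie_wavelength_cm` is hypothetical, and the value of $h$ is the one quoted in this section in c.g.s. units (the modern value is somewhat larger, about $6.626 \times 10^{-27}$ erg sec).

```python
# Planck's constant in erg sec, as quoted in this section (a 1933 value;
# the modern value is about 6.626e-27).
H_ERG_SEC = 6.54e-27


def de_broglie_wavelength_cm(mass_gm, speed_cm_per_sec, h=H_ERG_SEC):
    """Wave length lambda = h / (m v), in centimeters, for a nonrelativistic particle."""
    return h / (mass_gm * speed_cm_per_sec)


if __name__ == "__main__":
    cases = [
        ("1 gm moving at 1 cm per second", 1.0, 1.0),
        ("electron (9e-28 gm) moving at 1e8 cm per second", 9e-28, 1e8),
    ]
    for label, mass, speed in cases:
        print(label, de_broglie_wavelength_cm(mass, speed), "cm")
```

The two cases are the ones discussed in this section: a body of ordinary size, for which the wave length is far below any measurable dimension, and an electron, for which it is comparable with the size of an atom.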
A particle of mass 1 gm., moving with velocity 1 cm. per second, has a wave length given by $h/\lambda = mv = 1$, $\lambda = h/1 = 6.54 \times 10^{-27}$ cm., exceedingly small compared with all ordinary dimensions. If such a particle passed through a pinhole, the corresponding wave would be diffracted, but the angle of spreading would be extremely small. With other magnitudes for the mass, however, the diffraction effect can become important. Thus an electron, of mass $9 \times 10^{-28}$ gm., moving, for example, with a velocity of $10^8$ cm. per second, has a wave length of

$$\lambda = \frac{6.54 \times 10^{-27}}{9 \times 10^{-28} \times 10^{8}} = 7.3 \times 10^{-8}\ \text{cm.},$$

a quantity of atomic dimensions. Thus if the electron passed through an aperture of atomic size, as a hole between atoms, it could be diffracted through a large angle. It is then evident that diffraction of electrons on an atomic scale is important; in fact, we shall see in the next chapter that this is just why the atomic scale is what it is.

206. Wave Packets and the Uncertainty Principle. Just as with light, we assume a statistical relation between the intensity of the wave and the probability of finding the particle at the corresponding point. A uniform infinite monochromatic plane wave corresponds to a particle traveling with a definite energy and momentum in a definite direction, but whose position is entirely unknown. Such a mechanical system would be approximated by electrons which had been all accelerated to the same speed in a vacuum tube, but whose individual positions we did not know. If we wished to fix the positions, we could let the beam of electrons fall on a screen containing a pinhole. Then any electron found on the far side would have gone through the pinhole, so that we would know its $y$ coordinate with an uncertainty $\Delta y$ (using the same coordinates as with the optical case, $x$ normal to the screen, $y$ in the plane of the screen).
After passing through, the electrons would travel practically in a straight line; but the ray will be deviated on account of diffraction, and since the law of motion of the electron is not definitely fixed, but is merely a probability law connecting it with the wave, there will be an uncertainty in its $y$ momentum, given by $\Delta y\,\Delta p_y = h$. Similarly if we try to determine the $x$ coordinate of the electron by opening and closing a shutter, so that we know exactly when it went through the hole, we thereby introduce a broadening into the spectrum of the wave, hence an uncertainty in wave length of the particle, and finally in its $x$ component of momentum, given by $\Delta x\,\Delta p_x = h$. Thus the principle of uncertainty operates with particles as with photons.

The wave packet, as set up in this way, may be made extremely small without diffraction, if the wave length is as small as it often is. Thus with a particle of the mass of familiar objects, the wave function representing the motion of its center of gravity can be concentrated in a region much smaller than atomic dimensions, without being troubled by diffraction. This packet would then, in a force field, travel around in a certain way without appreciable spreading. We know at each instant that the particle is within the packet. Thus for all practical purposes the law of motion of the packet is the same as the law of motion of the particle. This then is the direction in which we look for the derivation of Newtonian mechanics from wave mechanics.

We at once see that the motion of a wave packet in mechanics will be more complicated than in optics, for the wave length in mechanics, $\lambda = h/p$, changes continuously from place to place. If we have a conservative motion, for which alone it is easy to formulate wave mechanics, we have $p^2/2m + V = E$, $\lambda = h/p = h/\sqrt{2m(E - V)}$, a function of position on account of $V$. $E$ stays constant, as usual, so that the frequency is constant, as in optics.
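The dependence of the wave length on position can be made concrete by evaluating $\lambda = h/\sqrt{2m(E - V)}$ for a fixed total energy and several values of the potential energy. The following is a minimal sketch and not part of the original text: the function name `local_wavelength_cm`, the electron mass and value of $h$ (taken from Sec. 205), and the sample energies in ergs are all assumptions chosen for illustration.

```python
import math

H_ERG_SEC = 6.54e-27       # Planck's constant as quoted in Sec. 205 (erg sec)
ELECTRON_MASS_GM = 9e-28   # electron mass as quoted in Sec. 205 (gm)


def local_wavelength_cm(total_energy_erg, potential_erg,
                        mass_gm=ELECTRON_MASS_GM, h=H_ERG_SEC):
    """Wave length h / sqrt(2 m (E - V)) at a point where the potential energy is V.

    Only defined where the kinetic energy E - V is positive."""
    kinetic = total_energy_erg - potential_erg
    if kinetic <= 0:
        raise ValueError("kinetic energy E - V must be positive for a real wave length")
    return h / math.sqrt(2.0 * mass_gm * kinetic)


if __name__ == "__main__":
    total_energy = 1.6e-11   # illustrative total energy in ergs (roughly ten electron volts)
    for potential in (0.0, 0.8e-11, 1.2e-11, 1.5e-11):
        print(potential, local_wavelength_cm(total_energy, potential), "cm")
```

With $E$ held fixed, the wave length lengthens as $V$ rises toward $E$; reducing the kinetic energy to a quarter of its value halves the momentum and so doubles the wave length.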
But the variable $\lambda$ corresponds to a variable index of refraction. There are only a few optical cases where this is true. Generally the index changes sharply from one medium to another, and the ray of light consists of segments of straight lines. In refraction by the atmosphere, however, as in astronomy, or in the refraction by heated air over the surface of the earth, as in mirages, the path of the light rays is curved instead of sharply bent, and this corresponds to the usual mechanical case, where the paths or orbits are curved. To proceed further with the connection between wave mechanics and Newtonian mechanics, we must first investigate the shape of a ray in a case where the index changes with position. The general principle governing this is called Fermat's principle.

207. Fermat's Principle. Assume that we have an optical system, with a ray traveling from $P_1$ to $P_2$. We may start the ray by letting parallel light fall on a pinhole, so that really the light travels in a narrow beam, eventually reaching $P_2$. We assume that the dimensions are so large that diffraction can be neglected. Then suppose we compute the time taken for light to pass from the point $P_1$ to $P_2$ along the actual ray. This is $\int_{P_1}^{P_2} ds/v$, where the integral is a line integral, computed along the ray from $P_1$ to $P_2$, $ds$ is the element of length along the ray, and $v$ is the velocity, a function of position if the index of refraction changes from point to point. Next, suppose that we compute the same integral for other paths joining $P_1$ and $P_2$, but differing in between. Since in general the integral is not independent of path, we shall get different answers. In general, if we go from one path to another, the difference of the integral between the paths will be of the same order of small quantities as the displacement of the path.
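The order of smallness of such a difference can be examined numerically in the simplest case of constant velocity, where the integral is proportional to the length of the path. The following is a minimal sketch and not part of the original text: the two-segment path, the unit half-span, and the function names `path_length` and `length_change` are assumptions made for illustration. A path runs from $P_1$ to $P_2$ through a single corner, and the corner is displaced by a small amount, first for a bent path and then for the straight one.

```python
import math


def path_length(corner_height, half_span=1.0):
    """Length of the path P1 -> corner -> P2, with P1 and P2 on the x axis a
    distance 2*half_span apart and the corner above the midpoint."""
    return 2.0 * math.hypot(half_span, corner_height)


def length_change(base_height, displacement):
    """Change in path length when the corner is raised by `displacement`."""
    return path_length(base_height + displacement) - path_length(base_height)


if __name__ == "__main__":
    for displacement in (1e-1, 1e-2, 1e-3):
        bent = length_change(0.5, displacement)      # corner starts off the straight line
        straight = length_change(0.0, displacement)  # corner starts on the straight line
        print(displacement, bent / displacement, straight / displacement ** 2)
```

For the bent path the change divided by the displacement should settle toward a fixed nonzero number, so the change is of the first order, as stated above for a general path. For the straight path it is the change divided by the square of the displacement that settles down, so the change is of the second order.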
But Fermat's principle says that if one path is the correct ray, and the other is slightly displaced from it, the difference in the integral is of a higher order of small quantities. This is a sort of condition met in the calculus of variations. In that subject, $\delta\int_{P_1}^{P_2} ds/v$ is the variation of the integral, and it means the difference between the integral over one path, and over another infinitely near to it. Fermat's principle says that the variation of the integral is zero for the actual path; meaning that the actual variation is infinitesimal of a higher order than the variation of path, so that it vanishes in the limit of small variation of path. The idea of the variation of an integral is closely analogous to that of the differential of a function in ordinary calculus. Thus, if the variation of an integral is zero, for a given path, that means that the integral itself is a maximum or minimum with respect to variations of path; or, more generally, that it is stationary, not changing with small variations of path. Setting the variation equal to zero corresponds to setting the derivative of a function equal to zero in calculus.

Let us verify Fermat's principle in two simple cases. First, we assume that $v$ is everywhere constant, so that there are no mirrors or lenses. Then we can take $v$ outside the integral, dividing through by it, and having $\delta\int_{P_1}^{P_2} ds = 0$. That is, the true path of light between $P_1$ and $P_2$ is that line which has minimum (or maximum) length, and joins $P_1$ and $P_2$. Obviously the minimum is desired in this case; and the shortest line between $P_1$ and $P_2$ is a straight line, which then is the ray. Let us compute the variation of path, to check the variation principle. In Fig. 59 (a), we show the straight line joining $P_1$ and $P_2$, and also a varied path, $P_1BP_2$. The length of this second path is

$$2\sqrt{(P_1A)^2 + (AB)^2} = 2(P_1A)\left[1 + \frac{1}{2}\frac{(AB)^2}{(P_1A)^2} + \cdots\right] = (P_1P_2) + \frac{2(AB)^2}{(P_1P_2)},$$

differing from the direct path $P_1P_2$ by an infinitesimal of the second order, if $(AB)$, the deviation of the path $P_1BP_2$ from $P_1AP_2$, is regarded as small of the first order. In other words, the path $P_1AP_2$ satisfies the condition that the variation of its length is zero (that is, small of the second order). On the other hand, if we started with a crooked path, as $P_1AP_2$ in (b), then the path $P_1BP_2$ differs from it approximately by the amount $(BC) + (BD)$, or approximately $2(AB)\sin\theta$, an infinitesimal of the same order as $(AB)$, so that in this case the variation is not zero, and the crooked path is not the correct one.

Fig. 59. Variation of length of path. (a) The straight line $P_1AP_2$ differs in length from the varied path $P_1BP_2$ by a small quantity of the order of the square of $AB$. (b) The broken line $P_1AP_2$ differs from $P_1BP_2$ by a quantity of the order of $AB$ itself. Hence the straight line of (a), rather than the broken one of (b), is the one for which the variation of length is zero.

As a second example, we take the case of reflection. In Fig. 60, consider the path $P_1AP_2$, connecting $P_1$ and $P_2$, satisfying the law of reflection on the mirror $OA$. This path evidently equals $P_1'AP_2$ in length, where $P_1'$ is the image of $P_1$. Similarly a slightly different path $P_1BP_2$ equals $P_1'BP_2$, which is therefore longer, since $P_1'AP_2$ is the straight line connecting $P_1'$ and $P_2$. In other words, $P_1AP_2$ makes the integral a minimum, and is the correct path. In this case we could again easily show that the integral along $P_1BP_2$ differed from that along $P_1AP_2$ by quantities in the square of $AB$, verifying our statement that if the path is displaced by small quantities of the first order $(AB)$ the integral is changed only in the second order $(AB^2)$. A similar proof can be carried through for the case of refraction, showing that the law of refraction is given by Fermat's principle.

Fig. 60. Fermat's principle for reflection. The path $P_1AP_2$, equal to $P_1'AP_2$, differs in length from its neighbor $P_1BP_2$ by a small quantity of the order of the square of $AB$.

A fundamental proof of Fermat's principle can be given directly from the determination of the ray from diffraction theory. The condition that a point $P_2$ lie in the ray, if we discuss diffraction through the aperture by Huygens' principle as in the last chapter, is that the various paths leading from $P_1$ to $P_2$, by going to various points of the aperture, and then being scattered in Huygens' wavelets from there to $P_2$, should be approximately the same, so that the light can interfere constructively at $P_2$. This means that such paths, as measured in wave lengths, are all approximately the same length. In other words, for constructive interference, $\int_{P_1}^{P_2} ds/\lambda$, the number of wave lengths between $P_1$ and $P_2$, must be independent of slight variations in the path, or $\delta\int_{P_1}^{P_2} ds/\lambda = 0$. This clearly is the condition whether $\lambda$ is independent of position or not, for, even if the waves change in length from point to point, we must still have the waves interfere to get the ray, and this still demands the same number of wave lengths along neighboring paths. Now $\lambda = v/\nu$, and since $\nu$, the frequency, is a constant throughout the path of the light, we may then write the variation as $\nu\,\delta\int_{P_1}^{P_2} ds/v = 0$, from which, dividing by $\nu$, we have Fermat's principle. This interpretation in terms of the interference of the waves along the ray is the fundamental meaning of Fermat's principle.

208. The Motion of Particles and the Principle of Least Action. We shall now show that if we use the analogue to Fermat's principle in mechanics, it leads to the correct motion of the particle according to Newtonian mechanics. As we have seen, the wave problem representing the motion of a single particle whose variables we know is a ray.
And the path of this ray is given by Fermat's principle, which we may write in the form $\delta\int ds/\lambda = 0$. But now in wave mechanics, $h/\lambda = p$, the momentum, so that, canceling out the constant factor $h$, this becomes $\delta\int p\,ds = 0$. But this is a well-known equation of ordinary mechanics: the integral $\int p\,ds$, or $\int p\,dq$, if $q$ is the coordinate in a one-dimensional motion, is called the action, and the principle $\delta\int p\,dq = 0$, showing that the action is a maximum or more often a minimum, is called the principle of least action. And by the calculus of variations we can show that the principle of least action leads to Lagrange's equations, as the equations giving the motion of a particle which obeys the principle. This principle, or a closely related one called Hamilton's principle, also stated in terms of the calculus of variations, is often considered a fundamental formulation of the whole of mechanics, more fundamental than Newton's laws of motion, since these, in the form of Lagrange's equations, follow from it. As a matter of fact, the derivation of Lagrange's equations from the variation principle is the simplest way of deriving them, for one familiar with the calculus of variations, and leads to the equations directly in any arbitrary coordinate system. But here we have gone even farther: we have sketched the derivation of the principle of least action from wave mechanics, as the law giving the shape of a ray, determined from interference of the waves. As we see from this, wave mechanics is the fundamental branch of mechanics, and ordinary Newtonian mechanics, the mechanics of particles, is derived from it.

Problems

1. Assume in Fig. 61 that $POP'$ is the path of the optically correct ray passing from one medium into a second one of different refractive index.
Prove Fermat's principle for this case, showing that the time for the ray to pass along a slightly different path, as $PAP'$, differs from that along $POP'$ by a small quantity of higher order than the distance $AO$. The figure is drawn so that $AB$, $CO$, are arcs of circles with centers at $P$ and $P'$, respectively, and it is to be noted that for small $AO$, the figures $AOB$, $AOC$, are almost exactly right triangles.

Fig. 61. Fermat's principle for refraction.

2. An electron of charge $e = 4.774 \times 10^{-10}$ electrostatic units falls through a difference of potential of $V$ volts (1 volt = 1/300 e.s.u.) and bombards a target, converting all its energy into radiation, which travels out as one photon. Using the relations that the energy of the photon $= h\nu$, $\nu = c/\lambda$, where $c$, the velocity of light, is $3 \times 10^{10}$ cm. per second, find the wave length of the resulting radiation. Find the number of volts necessary to produce visible light of wave length 5,000 A. (1 A. is $10^{-8}$ cm.); x-rays of wave length 1 A.; gamma rays of wave length 0.001 A.

3. Assume that light falls on a metal and ejects photoelectrons, the energy required to pull an electron through the surface being at least 2 volts. Find the photoelectric threshold frequency, the longest wave length which can eject electrons, remembering that the long wave lengths have small photons which have not enough energy. Discuss the effect of work function (the energy required to pull the electron out) on photoelectric threshold.

4. Newtonian mechanics becomes inaccurate when the wave length of the particle becomes of the same order of magnitude as the dimensions involved. Consider the accuracy of Newtonian mechanics in the problem of an electron in an atom. Assume for purposes of calculation that the electron moves in a circular orbit of radius 0.5 A., with an angular momentum $h/2\pi$ (determine its speed, and hence wave length, from this fact).

5. Consider as in Prob.
4 the accuracy of Newtonian mechanics for a hydrogen atom in a hydrogen molecule. The hydrogen atom weighs about 1,800 times as much as an electron. Assume the speed of the atom to be such that its energy is the mean kinetic energy of a one-dimensional oscillator in temperature equilibrium at temperature 300° abs., or $\tfrac{1}{2}kT$, where $k = 1.31 \times 10^{-16}$, $T$ is the absolute temperature. Compare the wave length with the amplitude of oscillation of the atom. To find this, assume that it oscillates with simple harmonic motion, and that its frequency of oscillation is 3,000 cm$^{-1}$. (The unit of frequency, cm$^{-1}$, is the frequency associated with a wave length of 1 cm.) Knowing the frequency, mass, and total energy, it is then possible to find the amplitude.

6. Consider, as in Prob. 5, the same hydrogen molecule at 10° abs.; an atom of atomic weight 100, in a diatomic molecule of two like atoms, similar to the hydrogen molecule, with the same restoring force acting between the atoms (therefore with a much slower speed of vibration, on account of the larger mass), at 300° abs.; at 10° abs.

7. Consider whether the uncertainty principle is important in phenomena of astronomical magnitude. Assume a body of the mass of the earth (found from its radius of 4,000 miles, mean density 5.5), moving with a speed of 20 km. per second. Now a measurement of the position is considered, in wave mechanics, to introduce an uncertainty in the velocity, determined in terms of the uncertainty in the measurement of position by the relation $\Delta p\,\Delta q = h$. Suppose that the position of the body was determined in space with an error of only 1 m. (a much greater accuracy, of course, than could be really obtained). Find the corresponding uncertainty in momentum, and the angle $\theta$ through which the path is deviated by the measurement. Find how far from its original path the deviation would carry the body in a year.

8.
Conjugate foci in optics are points connected by an infinite number of possible correct paths. Thus by Fermat's principle the optical path, or length of time taken to traverse the ray, is stationary for each of these paths, meaning that the optical path is the same for each. Discuss this, showing that for the conjugate foci of a simple lens the optical path is the same for each ray, carrying out the actual calculation of time.

9. Using the properties of conjugate foci mentioned in Prob. 8, prove that if a hollow ellipsoid of revolution is silvered, to form a mirror, the foci of the ellipsoid are optical conjugate foci. Prove that a paraboloidal mirror forms a perfect image of a parallel plane wave coming along its axis.

CHAPTER XXIX

SCHRÖDINGER'S EQUATION IN ONE DIMENSION

The mathematical treatment of wave mechanics starts with a wave equation, similar to those of mechanical vibrations or of light. We shall not try to derive this equation from more fundamental principles, as we derive the equation of mechanical vibration from Newton's equations, or the wave equation of optics from Maxwell's equations; there are some ways of stating wave mechanics apparently somewhat more fundamental than the wave equation, but they are not the best methods to start one's study with. We shall thus commence by postulating the wave equation, though arriving at its form by analogy with other cases. In this chapter we take only the form not involving the time, since this has a close analogy to optics. The form including the time is more remarkable, in that it involves complex quantities explicitly in its statement. We shall later treat it, separate variables in it, and show that the part independent of time is the equation treated in this chapter. This equation was first given by Schrödinger, and is called Schrödinger's equation. As we recall, the index of refraction, and wave length, of the waves vary from point to point.
This means that the differential equation is very much like that of the nonuniform string, which we discussed in Chap. XIV. We shall be able to use the same approximate solution developed for that problem. We shall also get the condition for stationary waves, corresponding to the string held at both ends. This is the so-called quantum condition, and it now determines, not the overtones of a vibrating string, but the energy levels and stationary states of atoms and other systems. The problem, as in the string, leads to expansion in orthogonal functions, and we shall consider this theory in later chapters.

209. Schrödinger's Equation. The wave equation of optics, after the time is eliminated, can be written $\nabla^2 u + (4\pi^2/\lambda^2)u = 0$, where $u$ is the displacement. In the mechanical problem, $h/\lambda = p =$ momentum. We assume a potential function $V$ (wave mechanics is very difficult to formulate when there is no potential). Then the total kinetic energy is $p^2/2m$, so that $p^2/2m + V = E$, the total energy, and $p = \sqrt{2m(E - V)}$. Thus we have the equation

$$\nabla^2 u + \frac{8\pi^2 m}{h^2}(E - V)u = 0, \quad\text{or}\quad -\frac{h^2}{8\pi^2 m}\nabla^2 u + Vu = Eu. \tag{1}$$

These are two forms of Schrödinger's equation in the form not involving the time. Suppose that a solution of this equation is $u(x, y, z)$. Then the corresponding solution of the problem involving the time is this times an exponential function of the time. Since the frequency $\nu$ is $E/h$, this is $e^{2\pi i E t/h}\,u(x, y, z)$. We note that the differential equation for $u$, and hence the resulting solution, depend on the energy $E$, just as the function describing the shape of a vibrating string depends on the frequency. Hence we should properly use a subscript, $u_E(x, y, z)$. The general solution would now be a sum of such solutions for all different values of $E$,

$$\sum_E A_E\, e^{2\pi i E t/h}\, u_E(x, y, z), \tag{2}$$

as we had a sum of solutions as the general solution for the vibrating string.

210. One-dimensional Motion in Wave Mechanics.
For one-dimensional motion, where $u$ is a function of $x$ alone, Schrödinger's equation becomes

$$\frac{d^2u}{dx^2} + \frac{8\pi^2 m}{h^2}(E - V)u = 0. \tag{3}$$

Since in general $V$ is a function of $x$, this is an equation very much like that of the string with variable density but constant tension. Just as with that problem, we can easily set up an approximate solution of the problem, if the quantity $E - V$, corresponding to the density, does not change by too large a fraction of itself in a wave length, though the exact solution is generally difficult, and has been worked out in only a few special cases. The approximate solution is easily shown, by the method used in Chap. XIV, to be

$$u = \frac{\text{constant}}{\sqrt[4]{E - V}}\; e^{\pm\frac{2\pi i}{h}\int p\,dx}, \tag{4}$$

where $p$ has the value $\sqrt{2m(E - V)}$, as before. This method of solution, as applied to wave mechanics, is often known as the Wentzel-Kramers-Brillouin method. It immediately leads to one result of physical interest, when we consider the amplitude of the wave. We have seen in the last chapter that the intensity of the wave measured the probability of finding the particle at the corresponding point, just as in optics the intensity of the light-wave measures the probability of finding the photon. Now, if we use the wave function given above, with its complex exponential, we must evidently multiply by its conjugate to get the intensity, or the square of its amplitude:

$$u\bar{u} = \frac{\text{constant}}{\sqrt[4]{E - V}}\, e^{\frac{2\pi i}{h}\int p\,dx} \times \frac{\text{constant}}{\sqrt[4]{E - V}}\, e^{-\frac{2\pi i}{h}\int p\,dx} = \frac{\text{constant}}{\sqrt{E - V}} = \frac{\text{constant}}{p}. \tag{5}$$

To get the probability that the particle is in a small element of length $ds$, we must multiply through by $ds$, obtaining a constant $\times\, ds/p$. But now suppose a particle were moving along the $x$ axis according to the Newtonian mechanics, with the same energy $E$, in the same potential field $V$. The length of time which it would spend in any small element of length $ds$ would be $ds/v$, or $m\,ds/p$.
Apart from the arbitrary constant, which could be determined to bring agreement, this is just like the quantum expression. If we knew that the classical particle was moving in this way, but did not know when it started, all we could say would be that the probability of finding the particle in a given region at any time was proportional to the length of time which it would have to spend in that region. In other words, our solution, of constant energy, corresponds to a classical particle whose energy is determined but whose initial time of starting is undetermined, and we can find from our wave function the probability of finding it in any region. To the approximation to which the Wentzel-Kramers-Brillouin solution is correct, the classical and quantum probabilities agree exactly, but they do not to a higher approximation. At any rate, however, we can say that the wave function is large in regions where the particle is likely to be, or is moving slowly, and is small where the particle is moving rapidly and is unlikely to be. It should be stated that sometimes, instead of the wave function with complex exponential, we use the corresponding real wave function

$$\frac{\text{constant}}{\sqrt[4]{E - V}} \cos\left(\frac{2\pi}{h}\int p\,dx\right). \qquad (6)$$

In this case, the probability function has a factor of $\cos^2\left(\frac{2\pi}{h}\int p\,dx\right)$, introducing a sinusoidal fluctuation of probability which must be ignored in making comparisons with the classical probability.

In the preceding paragraph, we have tacitly assumed that the kinetic energy $E - V$ was always positive, so that $p$ was real. But in many problems, as we have seen from our discussion of classical mechanics, this is true only in limited regions, and outside these regions $p$ becomes imaginary. Even in this case, the method of Wentzel, Kramers, and Brillouin is still formally correct. But there are two physical differences.
First, $\pm\frac{2\pi i}{h}\int p\,dx$ is now real, so that we have a real exponential, either increasing or decreasing with $x$, depending on the sign. Secondly, to keep the whole function real, we must make the first factor constant$/\sqrt[4]{V - E}$, which amounts to changing the constant by multiplying by $\sqrt[4]{-1}$. The approximate solution does not hold at all in the neighborhood of the point where the kinetic energy is zero, for there the wave length is infinite, and the assumption that $E - V$ changes only a little in the distance of a wave length cannot be true. But we can easily see how to construct an approximate solution in this region, for the differential equation here is simply $d^2u/dx^2 = 0$, the equation of a straight line; the actual curve of $u$ against $x$, as we readily see, has a point of inflection at the point where $E = V$, being concave downward where the kinetic energy is positive, concave upward where it is negative. We can then take the exponential solution in the region of negative kinetic energy, and the oscillatory one in the region of positive kinetic energy, and join them by a line which is approximately straight. It is obvious, as we see for instance in Fig. 62, that, if we know beforehand the constants of the exponential solution (as for instance the amplitudes of the two terms, one increasing and the other decreasing exponentially, which we must add to get the complete solution) the initial value and slope of the sinusoidal solution must be definitely determined to make the two join smoothly. That is, the phase of the sinusoidal solution, or the amplitudes of sine and cosine functions which we add together,

Fig. 62. Joining of exponential and sinusoidal functions at point where $p = 0$. Upper curve shows potential and total energy against $x$, lower curve shows wave function.
The exponential part of the function is so chosen that the amplitude of the term increasing exponentially with decreasing $x$ is zero; otherwise the function would go to infinity instead of asymptotically to zero as $x$ became negatively infinite.

are determined. The same thing is true at every such boundary that we cross; if we once determine the two arbitrary constants in one part of the region, the whole function is determined, to make exponential and sinusoidal curves join smoothly. This must naturally be true, since the differential equation is one of the second order, with just two arbitrary constants.

211. Boundary Conditions in One-dimensional Motion. Suppose, first, that we consider a mechanical problem where the kinetic energy is always positive. Then there are no regions where the wave function is exponential; it is always sinusoidal, of finite amplitude. For any energy we have two solutions, which, bringing in the time but writing in exponential form, are

$$\frac{\text{constant}}{\sqrt[4]{E - V}}\, e^{\frac{2\pi i}{h}\left(Et \pm \int p\,dx\right)}, \qquad (7)$$

of which the real parts represent progressive waves traveling to left or right along the $x$ axis. This corresponds to the fact that the corresponding mechanical particles can travel in either direction, and, as we have seen, the intensity of the wave at any point properly agrees with the probability that the particle should be in that region, as computed classically on the assumption that we do not know when the particle started.

Next let us assume that $E - V$ remains positive to infinity in one direction, say to the right, but becomes negative to the left of a certain point, say $x = x_1$, as in Fig. 62. The solution will then be exponential to the left of $x = x_1$. But in general it will be a linear combination of two exponential functions, one increasing exponentially in magnitude to $\infty$ as $x$ approaches $-\infty$, the other decreasing exponentially to zero.
If the amplitude of the former is different from zero, then the intensity of the wave will be infinite at $-\infty$, meaning that the probability of finding the particle at $-\infty$ is infinitely greater than of finding it anywhere else. This is ordinarily not the physical situation we wish to describe; hence we must assume that the amplitude is zero, and that the solution to the left of $x_1$ has just the one term

$$\frac{A}{\sqrt[4]{V - E}}\, e^{\frac{2\pi}{h}\int_{x_1}^{x}\sqrt{2m(V - E)}\,dx}, \qquad (8)$$

which goes to zero at $x = -\infty$. But now this comes up to the point $x = x_1$ with a definitely determined slope (or rather, the ratio of slope to function, in which the arbitrary constant factor cancels out, is definitely determined). Then there is just a very definite sinusoidal function which joins onto this, as Fig. 62 suggests: the approximate solution for $x > x_1$ is given by

$$\frac{2A}{\sqrt[4]{E - V}} \sin\left(\frac{2\pi}{h}\int_{x_1}^{x} p\,dx + \alpha\right). \qquad (9)$$

It can be shown that for continuity of Eqs. (8) and (9) at $x_1$ we must have $\alpha = \pi/4 = 45$ deg. This statement means that the sine curve, instead of having a node at $x_1$, has already at that point passed through an eighth of a wave length. It is as if this eighth wave length were stretched out to infinity to form the exponential part of the curve.

We have seen, then, that a boundary where $E = V$ imposes a definite boundary condition on the solution. In our problem where the motion extends to infinity in one direction, the condition can be always satisfied, by proper choice of phase and amplitude of the sinusoidal function, as we have seen. But there are two interesting results of our calculation. First, the wave in the region where kinetic energy is positive becomes now a real function of position, or correspondingly a real function of time. In other words, it is a standing wave, not a progressive wave. It corresponds to superposed progressive waves traveling with equal intensity in both directions.
The progressive wave approaching from the right is reflected at the boundary, and turns back without diminution of intensity on the reflection. The mechanical situation is that the particle, approaching the point where kinetic energy is negative, is reflected and turned back, just as it would be in the same problem in classical mechanics. But the other interesting result is that, on account of the exponential terms to the left of $x = x_1$, the particles can slightly penetrate the region where kinetic energy is negative. On account of the rapid dying out of the exponential, this effect is not large, but we shall see in the next section that there can be cases where it is very important physically. This penetration by an exponential wave has an analogy in optics: a wave of light approaching an optically rarer medium at an angle greater than the critical angle is totally reflected, but at the same time, as we have seen in Sec. 168, Chap. XXIII, there is a disturbance, dying out exponentially, in the rarer medium, almost exactly equivalent to what we have here.

212. The Penetration of Barriers. The exponential penetration of particles into the region of negative kinetic energy has as a result that in wave mechanics, unlike classical mechanics, a particle can go from one region of positive kinetic energy to another, even though there is a barrier of negative kinetic energy between. Such barriers are found, for example, in some cases at the surface of a metal, where the electrons in emerging from the metal, for example at high temperature in thermionic emission, may find a surface layer of atoms, exerting on them such a strong repulsive force that they would be unable to penetrate on classical mechanics, but can in quantum theory. Suppose that we have a simple barrier of the sort shown in Fig.
63, where the potential has one constant value to the left of $x_0$, a second high value between $x_0$ and $x_1$, and a third lower value to the right of $x_1$, and where $E - V$ is negative only between $x_0$ and $x_1$. The corresponding problem in metals is that where the region to the left of $x_0$ represents the interior of the metal, that to the right of $x_1$ the space outside, that between $x_0$ and $x_1$ the surface layer or barrier.

Fig. 63. Potential barrier. The barrier is between $x_0$ and $x_1$. Motion with the energy $E_1$ would have a wave function large only to the left of $x_0$, rapidly decreasing to the right of $x_0$. With the energy $E_2$, the wave function would be large on both sides of the barrier, small but not zero within it, giving the possibility of penetrating the barrier. The wave function of energy $E_3$ would be large everywhere.

An electron of low energy, as $E_1$, will be confined, except for a small exponential term, to the region to the left of $x_0$, or the interior of the metal, and will never escape. An electron of the very high energy $E_3$ will be able to escape, either on classical or quantum mechanics. But an electron of intermediate energy $E_2$ can penetrate the barrier and escape on quantum mechanics, but not in Newtonian mechanics. These electrons of high energy, as $E_2$ or $E_3$, are met only at high temperatures, so that we see the connection with thermionic emission.

Consider an electron of energy $E_2$, and a solution which to the right of $x_1$ is a progressive wave traveling to the right. Then within the barrier we should have a combination of the two kinds of exponential functions, one increasing exponentially to the left, the other decreasing, with amplitudes properly chosen to satisfy the boundary condition of continuity of the function and its slope at $x_1$. These in turn will join onto two progressive waves to the left of $x_0$, one traveling to the right, one to the left.
The final result may be described as follows: an incident progressive wave falling on $x_0$ from the left; a reflected wave in the region to the left; a transmitted wave to the right of $x_1$, the transmission through the barrier being of the real exponential form. We can tell something without much trouble about the amount transmitted. For within the barrier the term

$$\frac{\text{constant}}{\sqrt[4]{V - E}}\, e^{-\frac{2\pi}{h}\int\sqrt{2m(V - E)}\,dx},$$

increasing exponentially to the left, is the important one. And we readily see that its amplitudes at $x_1$ and $x_0$ measure, at least in order of magnitude, the relative amplitudes of transmitted and incident waves. Thus the fraction transmitted depends on the square of the quantity $e^{-\frac{2\pi}{h}\int_{x_0}^{x_1}\sqrt{2m(V - E)}\,dx}$. We work out examples of this integral in the problems, showing that there can be barriers of atomic size small enough so that appreciable penetration takes place, though in general this is not true, since a small increase in the height or breadth of a barrier can, on account of the exponential, make an enormous difference in the ease of penetration.

213. Motion in a Finite Region, and the Quantum Condition. Assume next that the kinetic energy is positive only in a finite region, so that classically the motion would be limited to that region. Then there will be a boundary condition on the wave function at each boundary of the region. Just as with the string held at both ends, this condition cannot in general be satisfied; it can be satisfied only for certain energies (corresponding to certain frequencies with the string). Using the approximate method of Wentzel, Kramers, and Brillouin, it is easy to see the nature of this condition. For each boundary must have essentially the treatment of Fig. 62, only the exponential decreasing toward infinity being allowed, whereas with an arbitrary energy the exponential would increase toward infinity in at least one direction.
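As a rough numerical aside on the barrier estimate of Sec. 212, the fragment below evaluates the square of $e^{-\frac{2\pi}{h}\int\sqrt{2m(V - E)}\,dx}$ for a rectangular barrier, where the integral reduces to $\sqrt{2m(V - E)}$ times the breadth. The rectangular shape, the 1 eV energy deficit, the electron mass, and the name `penetration_factor` are illustrative assumptions and are not taken from the text; the result is an order-of-magnitude factor only, not an exact transmission coefficient.

```python
import math

H = 6.62607015e-34       # Planck's constant h, in J s
M_E = 9.1093837015e-31   # electron mass, in kg
EV = 1.602176634e-19     # one electron volt, in J


def penetration_factor(deficit_ev, width_m, mass=M_E):
    """Square of exp(-(2*pi/h) * sqrt(2m(V-E)) * width) for a flat barrier.

    deficit_ev : V - E inside the barrier, in electron volts (assumed constant)
    width_m    : breadth of the barrier x1 - x0, in meters
    """
    momentum_like = math.sqrt(2.0 * mass * deficit_ev * EV)
    exponent = (2.0 * math.pi / H) * momentum_like * width_m
    return math.exp(-2.0 * exponent)


# Hypothetical barriers 1 eV above the electron energy, of several breadths.
for width_angstrom in (1.0, 2.0, 5.0, 10.0):
    factor = penetration_factor(1.0, width_angstrom * 1.0e-10)
    print(f"breadth {width_angstrom:4.1f} A: factor ~ {factor:.3e}")
```

Because the breadth and the square root of the height both sit in the exponent, doubling the breadth squares the factor, which is the sensitivity to height and breadth remarked on in Sec. 212.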
We have seen that the exponential part of the curve corresponds to $\tfrac{1}{8}$ wave length of the sinusoidal part. The number of wave lengths between $x_1$ and $x_2$ is $\int_{x_1}^{x_2} \frac{p}{h}\,dx$. Thus the whole number of waves between $-\infty$ and $\infty$, taking account of the two exponential ends, is $\int_{x_1}^{x_2} \frac{p}{h}\,dx + \tfrac{1}{4}$. Since the function goes to zero at both limits, this must be a whole number of half waves, or twice it must be a whole number. Hence

$$2\int_{x_1}^{x_2} \frac{p}{h}\,dx + \frac{1}{2} = 1, 2, 3, \ldots, \qquad 2\int_{x_1}^{x_2} p\,dx = \left(n + \tfrac{1}{2}\right)h, \quad n = 0, 1, 2, \ldots \qquad (10)$$

This is the so-called quantum condition, developed particularly by Sommerfeld. We must remember that, since it is based on the approximation of Wentzel, Kramers, and Brillouin, it is not necessarily an exact condition. In some cases, as the linear oscillator, taken up in Prob. 5, it proves to be exactly true. In other cases, as a particle moving freely between two reflecting walls, as considered in Prob. 11, a similar condition holds, except that the quantum number, which here is $(n + \tfrac{1}{2})$, a half integer, is instead a whole integer. There are still other problems, as the hydrogen atom, in which a modified form of the condition is correct. In most cases, however, the quantum condition gives only an approximation, though a good one.

A number of problems can be solved exactly when the motion is confined to a finite region, and it is by comparison with these exact solutions that one can check the method of Wentzel, Kramers, and Brillouin, and the quantum conditions. Thus, in Prob. 6 we show that the wave equation for the linear oscillator can be solved as an exponential times a power series. This power series in general diverges for large $x$, indicating a function which goes to infinity as $x$ becomes infinite. But if we give the energy a value for which $2\int_{x_1}^{x_2} p\,dx = (n + \tfrac{1}{2})h$, the series breaks off to form a polynomial, and the function goes to zero at infinity.
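The quantum condition (10) can be tried out numerically on the linear oscillator, for which $V = 2\pi^2\nu^2 m x^2$. The sketch below evaluates $2\int p\,dx$ by a midpoint rule and bisects on the energy until the integral equals $(n + \tfrac{1}{2})h$. The units $h = m = \nu = 1$, the integration window, the step counts, and the function names are assumptions made for illustration; the method also assumes a single classically allowed region inside the window.

```python
import math


def phase_integral(potential, energy, x_min, x_max, mass=1.0, steps=4000):
    """Return 2 * integral of p dx over the part of [x_min, x_max] where E > V.

    Midpoint rule; cells with negative kinetic energy are skipped, so the
    turning points do not have to be located separately.
    """
    dx = (x_max - x_min) / steps
    total = 0.0
    for i in range(steps):
        x = x_min + (i + 0.5) * dx
        kinetic = energy - potential(x)
        if kinetic > 0.0:
            total += math.sqrt(2.0 * mass * kinetic) * dx
    return 2.0 * total


def wkb_level(potential, n, e_low, e_high, x_min, x_max, h=1.0, mass=1.0):
    """Bisect on E until the phase integral equals (n + 1/2) h."""
    target = (n + 0.5) * h
    for _ in range(30):
        e_mid = 0.5 * (e_low + e_high)
        if phase_integral(potential, e_mid, x_min, x_max, mass) < target:
            e_low = e_mid
        else:
            e_high = e_mid
    return 0.5 * (e_low + e_high)


def oscillator(x, mass=1.0, nu=1.0):
    """V = 2 pi^2 nu^2 m x^2, the linear oscillator of frequency nu."""
    return 2.0 * math.pi ** 2 * nu ** 2 * mass * x * x


# With h = nu = 1 the levels should come out close to n + 1/2.
for n in range(4):
    print(n, wkb_level(oscillator, n, 0.0, 10.0, -1.0, 1.0))
```

The same two functions can be pointed at any other single-well potential by passing a different `potential`, provided the energy bracket and the window are widened to contain the classically allowed region.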
These are the only solutions we can use, and they give just the quantum condition we found before, though by a quite different method. Again, a rotator, a solid of fixed moment of inertia and constant angular momentum rotating on an axis in the absence of torques, has a wave function $e^{\pm\frac{2\pi i}{h}p\theta}$, where $p$, $\theta$ are angular momentum and angular rotation. Since $p$ is constant, the real forms of this are sin (or cos) $(2\pi p\theta/h)$. For this to represent a single-valued function of position, it is necessary, as with the circular membrane, to have the function periodic with period $2\pi$ in $\theta$. Thus we must have $2\pi p/h$ = integer = $m$, giving whole integral quantum numbers in this case, and determining the angular momentum as $mh/2\pi$.

214. Motion in Two or More Finite Regions. In classical mechanics, we do not have to discuss specially the case where there are two separated regions where the kinetic energy is positive, separated and bounded by regions where it is negative; the motion occurs in one or the other of these regions, and that is the end of it. But in wave mechanics, the barrier between regions is not entirely impenetrable. We shall not go into the mathematical details of the solution, for, while they involve no new ideas, they are rather tedious. But the result is that the particles can penetrate the barrier and go from one region to the other, just as we have seen in a previous section in considering a barrier between two regions each extending to infinity in one direction. There are some new situations, however. Each region by itself would have stationary states of its own, if the other were not there. But with the two, no one of these states refers to motion wholly in the one region; the particle can go back and forth from one to the other.
However, if the energy level is one that is characteristic of the one region and not of the other, the particle spends almost all of its time in that region of which its energy is characteristic. Once in a while it leaks over to the other side, but it soon finds its way back. It may be, however, that a given energy level will be characteristic of both regions; this is surely true if they are identical regions. Then the particle will travel back and forth from one to the other, spending equal lengths of time in each. This is an important physical case. For instance, in the hydrogen molecule, both atoms are just alike, and an electron finds a potential field which has two minima, one at each nucleus. It then can oscillate back and forth, spending half its time about one nucleus, half on the other. These problems are closely analogous to that of coupled oscillators, which we have already taken up. There we found that one oscillator would not move without setting the other into vibration, and similarly here the wave function cannot be large in one region without having a value in the other also. And here we have a special case if the two regions are identical, as we did before if the oscillators were equivalent. We shall find that the whole mathematical treatment is closely analogous.

We can finally have motion in two regions, one finite, the other reaching to infinity. Then, if the particle starts in the finite range, it is able in time to leak across the boundary, and go off to infinity. The present explanation of radioactivity is based on this idea. An alpha particle is supposed to be held in an atomic nucleus by a restoring force pulling it to a position of equilibrium. But if it were outside, then being positively charged, it would be repelled from the positive nucleus, the repulsion going to zero at infinity. Thus we should have a potential curve as in Fig.
64, where potential is drawn as function of $r$, the distance from the center of the nucleus to the escaping alpha particle. If now the alpha particle has energy $E$, and is originally within the nucleus, it will eventually leak out, going off to infinity with a large kinetic energy, as the ejected alpha particles are actually found to have.

Fig. 64. Potential curve for radioactive disintegration. A wave function of energy approximately $E$, starting out as a wave packet within the valley of the potential curve, would gradually leak out through the barriers.

Problems

1. Prove that the function $\frac{\text{constant}}{\sqrt[4]{E - V}}\, e^{\pm\frac{2\pi i}{h}\int p\,dx}$, where $p = \sqrt{2m(E - V)}$, is an approximate solution of Schrödinger's equation, becoming more and more accurate as $V$ changes by a smaller and smaller amount in a wave length.
2. Note that in Bessel's equation for $J_m$, when $m > 0$, there is a region near the origin where the Wentzel-Kramers-Brillouin approximate solution is exponential rather than sinusoidal. Discuss the solution qualitatively for $x < m$, where $m$ is fairly large, showing how this solution joins onto the sinusoidal one found in Prob. 9, Chap. XIV.
3. Note that the solution of Schrödinger's equation is sinusoidal or exponential in a region of constant potential. Discuss the one-dimensional problem of particles going from one region with constant potential $V_1$ to a second region of constant potential $V_2$, when the energy is great enough so that the kinetic energy is positive in both regions. Satisfy boundary conditions at the surface, making $u$ and its derivative continuous, joining the two sinusoidal functions together at the boundary. Show that some of the incident particles travel across the boundary, but that some are reflected back, contrary to classical mechanics. Find the fraction reflected.
4. Assuming the potential function of Fig. 63, consider particles striking the barrier from the left with energy $E_2$.
Set up the solution, satisfying the boundary conditions at $x_0$ and $x_1$, and get an expression for the reflection coefficient as a function of the height of the barrier. Show that the reflection coefficient approaches unity if the barrier is infinitely high, or infinitely broad.
5. Show that $2\int_{x_1}^{x_2} p\,dx$ for an oscillator of natural frequency $\nu$, energy $E$, equals $E/\nu$. Show that therefore the quantum condition leads to the energy levels $E = (n + \tfrac{1}{2})h\nu$ for the oscillator.
6. Solve the problem of the linear oscillator of frequency $\nu$, where $V = 2\pi^2\nu^2 m x^2$. To do this, set $u = e^{-\frac{2\pi^2 m\nu}{h}x^2}\, v(x)$, and set up the differential equation for $v(x)$. For convenience, introduce the change of variables $y = 2\pi\sqrt{m\nu/h}\; x$. Solve in series, and show that the resulting series breaks off only if $E = (n + \tfrac{1}{2})h\nu$, where $n$ is an integer.
7. Using the series of Prob. 6, investigate the behavior for large $x$ if the series does not break off. Show that for very large $x$, $v$ approaches the series for $e^{\frac{4\pi^2 m\nu}{h}x^2}$, so that the whole function $u$ increases exponentially with $x^2$, and cannot be used as a wave function.
8. Compute and plot wave functions of the linear oscillator corresponding to $n = 0, 1, 2, 3, 4$. From the graphs find the region in which the solution is oscillatory (that is, the region between the points of inflection). Draw the potential curve and the values of $E$ corresponding to these stationary states, and show that the motion is oscillatory in the region where the kinetic energy is positive.
9. Set up the approximate solution for the linear oscillator problem by the Wentzel-Kramers-Brillouin method, getting expressions for the functions in both the sinusoidal and the exponential ranges. Investigate to see how well these functions join on at the point of inflection.
10. Compute and plot the approximation of Prob. 9 corresponding to $n = 4$, and compare with the exact solution.
11.
A particle executes one-dimensional motion in a container, having constant potential inside, and with the potential becoming suddenly infinite at the walls, so that the particle never gets out. Show that the boundary condition is that the wave function must be zero at the walls, as the displacement of a stretched string is zero at its ends. Find the wave functions of the problem, and find the energy of the particle in the $n$th stationary state.

CHAPTER XXX

THE CORRESPONDENCE PRINCIPLE AND STATISTICAL MECHANICS

The quantum condition has a close connection with the phase space and the Hamiltonian method, which we have discussed in Chap. IX. Hamiltonian methods have, in fact, been the guiding principle in the development of the quantum theory. At the same time, the phase space is fundamental to statistical mechanics, the mathematical foundation of thermodynamics. For that reason, we may profitably treat these subjects together, though, of course, statistical mechanics can be developed entirely from the basis of classical theory. Nevertheless, on account of the essentially statistical nature of the quantum theory, it yields an almost more natural approach to statistical mechanics than is possible in Newtonian mechanics, and by developing the two together we can illustrate the correspondence between classical and quantum mechanics which must hold, since the classical theory is a correct limiting form of quantum theory for large-scale problems.

215. The Quantum Condition in the Phase Space. In Fig. 11, Chap. IX, we show the phase space for a linear oscillator, with a line of constant energy $E$, an ellipse of semiaxis $\sqrt{E/2\pi^2 m\nu^2}$ along the $q$ axis, and $\sqrt{2mE}$ along the $p$ axis. These quantities measure the maximum coordinate and momentum, respectively, which a particle of energy $E$ attains during its motion. For such an oscillator, the quantum condition (10), Chap. XXIX, equates twice the integral of $p\,dq$ between the minimum and maximum $q$ values to $(n + \tfrac{1}{2})h$.
Just as $\int y\,dx$ measures the area under the curve $y(x)$, so $\int p\,dq$ measures the area under the curve $p(q)$ in the phase space. The integral $\int_{q_1}^{q_2} p\,dq$ is that part of the area of the ellipse above the $q$ axis, and to get the whole integral we double this, obtaining also the integral below the $q$ axis. This may be written as an integral around the contour, from $q_1$ to $q_2$ around the upper branch of the curve, then back to $q_1$ along the lower part of the curve, in which $p$ and $dq$ are both negative, so that we contribute an equal positive term to the integral. In other words, the quantum condition may be written

$$\oint p\,dq = \left(n + \tfrac{1}{2}\right)h, \qquad (1)$$

where $\oint$ indicates an integral around the contour. And the physical interpretation is that the quantum integral is the area of the ellipse. Since this is $\pi ab$, where $a$ and $b$ are the two semiaxes, it is $\pi\sqrt{2mE}\,\sqrt{E/2\pi^2 m\nu^2} = E/\nu$, giving from Eq. (1) $E = (n + \tfrac{1}{2})h\nu$, in agreement with the result of Prob. 5, Chap. XXIX.

The results of the last paragraph are general: with any one-dimensional motion the quantum integral $\oint p\,dq$ represents the area of phase space enclosed by the path of the representative point, and the quantum condition says that this area is $(n + \tfrac{1}{2})h$, approximately. If we take successive stationary states, connected with successive quantum numbers $n$, each will have a curve in phase space, the path of a representative point of the corresponding energy, and the area between successive curves will, by the quantum condition, be $h$. Thus the phase space is divided up by these paths into a set of cells, each of area $h$, one for each stationary state.

216. Angle Variables and the Correspondence Principle. We have seen in Chap. IX, Sec. 59, that a change of variables, called a contact transformation, can be set up, in which the new coordinate $w$ increases uniformly with time, and the momentum $J$ stays constant.
To visualize this transformation in the case of the oscillator, we may imagine the phase space plotted with such scales of coordinates and momenta that the ellipses of constant energy become concentric circles. Then the new variables are essentially polar coordinates in phase space, the coordinate being the angle divided by $2\pi$, the momentum being proportional to the square of the radius, so that obviously the angle variable increases uniformly with time, the momentum staying constant. The momentum $J$, called the action variable, proves in fact to be precisely the area of the ellipse, or circle, or the same integral $\oint p\,dq$ which we meet in the quantum condition. In terms of the action variable, often called the phase integral, we saw that Hamilton's equation

$$\frac{\partial H}{\partial J} = \frac{dw}{dt} = \nu \qquad (2)$$

gave the frequency in terms of a simple calculation. This formula permits us to make an extremely interesting connection between the classical frequency of motion of a system and the frequency of the light emitted in a transition between two states of energy $E_2$ and $E_1$ according to Bohr's frequency condition

$$E_2 - E_1 = h\nu, \qquad (3)$$

described in Sec. 201. On the quantum theory, most energies $H$ of the system are not allowed; we may have rather only those satisfying the quantum condition (1). Thus $H$ cannot be regarded as a continuous function of $J$. We may, however, replace the derivative $\partial H/\partial J$ of (2) by the difference ratio $\Delta H/\Delta J$, in which $\Delta H$ is the energy difference between two states, $\Delta J$ the difference between their phase integrals. If we choose two states whose quantum numbers differ by unity, we have $\Delta J = h$, so that the difference ratio is $\Delta H/h = \nu$, giving precisely the quantum frequency according to Eq. (3).
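To make the replacement of the derivative by the difference ratio concrete, the sketch below uses the particle moving freely between two reflecting walls a distance $L$ apart, for which $\oint p\,dq = 2Lp = J$ and hence $E = J^2/8mL^2$, with the whole-integer condition $J = nh$. The choice of this example, the units $h = m = L = 1$, and the function names are assumptions made for illustration.

```python
def box_energy(action, mass=1.0, length=1.0):
    """E(J) = J^2 / (8 m L^2) for free motion between walls a distance L apart."""
    return action ** 2 / (8.0 * mass * length ** 2)


def classical_frequency(action, mass=1.0, length=1.0):
    """The derivative dE/dJ, worked out by hand: J / (4 m L^2)."""
    return action / (4.0 * mass * length ** 2)


def quantum_frequency(n, h=1.0, mass=1.0, length=1.0):
    """The difference ratio (E_n - E_{n-1}) / h for a transition with Delta J = h."""
    upper = box_energy(n * h, mass, length)
    lower = box_energy((n - 1) * h, mass, length)
    return (upper - lower) / h


# Ratio of difference ratio to derivative, the derivative being taken at J = n h.
for n in (1, 2, 10, 100):
    print(n, quantum_frequency(n) / classical_frequency(n * 1.0))
```

Worked by hand, the ratio is $(2n - 1)/2n$, so the two frequencies differ by a factor of two for the lowest transition and by half a per cent at $n = 100$.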
Hence we have the following relation: the derivative $\partial H/\partial J$ gives the classical frequency of motion of a system; the difference ratio $\Delta H/\Delta J$, where the difference of $J$ is one unit, gives the frequency of emitted light according to the quantum theory, or the frequency of oscillation of the oscillator mentioned in Sec. 202. We shall consider later the significance of transitions of more than one unit in $J$.

For the oscillator, as one can immediately see from the fact that its energy in the $n$th state is $(n + \tfrac{1}{2})h\nu$, the classical and quantum frequencies are exactly equal, the derivative equaling the difference. This is plain from the fact that here $E = J\nu$, so that the curve of $E$ against $J$ is a straight line, and the ratio of a finite increment in ordinate, divided by a finite increment in abscissa, equals the slope or derivative. But for any other case the curve of $E$ against $J$ is really curved, so that the derivative and difference ratio are different, and classical and quantum frequencies do not agree. Thus in Fig. 65 we show an energy curve for an anharmonic oscillator, in which the tightness of binding decreases with increasing amplitude, the frequency decreases, and therefore the slope decreases with large quantum numbers. Here the classical frequency, as given by the slope of the curve, does not agree with the quantum frequency connected with the transition indicated between $E_2$ and $E_1$, for the quantum frequency is the slope of the straight line connecting $E_2$ and $E_1$. We may assume, however, that if we go to a very high quantum number, so that we are far out on the axis of abscissas, any ordinary energy curve will become asymptotically fairly smooth and straight, so that the chord and tangent to the curve will more and more nearly coincide. This certainly happens in the important physical applications we shall make. In these cases, we may state Bohr's correspondence principle:

Fig. 65.
Energy curve for anharmonic oscillator. Slope of curve gives classical frequency, slope of straight line connecting $E_2$ and $E_1$ gives quantum frequency.

in the limit of high quantum numbers, the classical and quantum frequencies become equal. This is essentially simply a special case of the general result stated in Chap. XXVIII, that in the limit of small wave lengths (which for most practical purposes is the same as the limit of high quantum numbers) the classical and quantum theories become essentially equivalent.

217. The Quantum Condition for Several Degrees of Freedom. In classical mechanics, we have seen that certain problems, like the two-dimensional oscillator and the central field motion, are separable, so that they can be broken up into several one-dimensional motions. Since each of these motions was periodic, the whole motion is multiply periodic in these cases. In these particular problems with several degrees of freedom, separation of variables can also be carried out in the quantum theory. In phase space we can pick out the two-dimensional space representing one coordinate and its conjugate momentum, and the projection of the representative point on this plane will trace out a closed curve. There is a quantum condition associated with this coordinate, the area enclosed by the curved path in the two-dimensional space being a half integer times $h$. Thus we have a quantum number associated with each degree of freedom in such a problem. Further, we can introduce angle and action variables connected with each of the coordinates, just as if each formed a problem of one degree of freedom. The various frequencies of the multiple periodicity can be found by differentiating the energy with respect to the various $J$'s, and the correspondence principle can be applied to connect these classical frequencies with the quantum frequencies associated with various possible transitions.
It can be shown in general that any coordinate of, say, a doubly periodic motion can be analyzed into a sort of generalized Fourier series in the time, in which terms appear of frequencies
$$\tau_1 \nu_1 + \tau_2 \nu_2, \tag{4}$$
where $\tau_1$, $\tau_2$ are arbitrary integers. This is the generalization of the ordinary Fourier representation for a purely periodic motion, in which all frequencies $n\nu_1$ will in general appear which are integral multiples of the fundamental frequency. Now we can carry out a general correspondence between any one of these overtone or combination frequencies and a corresponding transition. Thus let us consider the transition in which $J_1$ changes by $\tau_1$ units, $J_2$ by $\tau_2$ units, where $J_1$ and $J_2$ are the two action variables. The quantum frequency emitted will be
$$\frac{E(J_1, J_2) - E(J_1 - \tau_1 h,\; J_2 - \tau_2 h)}{h}, \tag{5}$$
where $E$ is the energy, written as function of the $J$'s. But if we are allowed to replace differences by derivatives, as we assume we are in the correspondence principle, this becomes
$$\frac{1}{h}\left(\frac{\partial E}{\partial J_1}\,\tau_1 h + \frac{\partial E}{\partial J_2}\,\tau_2 h\right) = \tau_1 \nu_1 + \tau_2 \nu_2,$$
in agreement with Eq. (4), if $\nu_1 = \partial E/\partial J_1$, $\nu_2 = \partial E/\partial J_2$. Thus we have a one to one correspondence between all possible overtone vibrations of the classical motion and all possible quantum transitions. This correspondence is of great importance, for instance, in discussing intensities of radiation, as we shall see later. For each component of the Fourier representation is a sinusoidal vibration, with frequency (4), and a certain amplitude $A_{\tau_1, \tau_2}$. This oscillation, if it were the oscillation of an electric charge, would send out a radiation of frequency (4), with an intensity proportional to the square of the amplitude, as we have seen in Chap. XXV, where we found a rate of radiation
$$\frac{16\pi^4 A^2 \nu^4}{3c^3}.$$
Thus the Fourier component $A$ would directly determine the intensity of classical radiation.
It then seems very reasonable that, at least in the limit of high quantum numbers, this intensity would agree approximately with the intensity of the corresponding quantum transition given by Eq. (5). Thus one can derive from the correspondence principle definite information about probabilities of quantum transitions, for the rate of radiation of energy in a particular transition is proportional to the number of transitions occurring per unit time, or the probability of transition. We shall return to this question in a later chapter.

The results which we have mentioned are all for multiply periodic, separable problems in several dimensions. With an $n$-dimensional problem, and a $2n$-dimensional phase space, there are $n$ $J$'s which stay constant during the motion. Thus we may set up $n$ sets of surfaces, $J_1 = \text{constant}$, $J_2 = \text{constant}$, . . . $J_n = \text{constant}$, in the phase space, and the representative point moves so that it stays on an intersection of all $n$ surfaces, or in an $n$-dimensional region, instead of all through the $(2n - 1)$-dimensional energy surface, as it would in quasi-ergodic motion. The particular surfaces $J_1 = (n_1 + \tfrac{1}{2})h$, $J_2 = (n_2 + \tfrac{1}{2})h$, etc., divide up the phase space into cells, each of which is seen to have the volume $h^n$, at least in simple cases, and a little examination shows that there is just one stationary state per cell. Of course, the path of a representative point is always on an energy surface, and if we take only the quantized $J$ values, the corresponding representative points lie only on the energy surfaces corresponding to quantized energy values. In many cases it proves to be true that a number of different stationary states have the same energy. Such a problem is called degenerate, and the number of different states connected with the energy level is called the a priori probability of the level.
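The statement that the quantized surfaces divide the phase space into cells of volume $h^n$ can be illustrated for one degree of freedom. The sketch that follows is an illustration under stated assumptions rather than part of the original argument: it takes a linear oscillator with arbitrarily chosen mass and frequency, in units where $h = 1$, approximates each closed orbit in the $q$, $p$ plane by a polygon, and compares the area between successive orbits of energy $(n + \tfrac{1}{2})h\nu$ with $h$. The names `orbit_area`, `quantized_energy`, and `cell_area` are hypothetical.

```python
import math

H = 1.0     # Planck's constant, illustrative units
MASS = 1.0  # assumed mass of the oscillator
NU = 2.0    # assumed frequency of the oscillator


def orbit_area(energy, points=20000):
    """Area enclosed by the phase-space ellipse of given energy (shoelace formula)."""
    omega = 2.0 * math.pi * NU
    q_max = math.sqrt(2.0 * energy / (MASS * omega ** 2))  # amplitude of q
    p_max = math.sqrt(2.0 * MASS * energy)                 # amplitude of p
    area = 0.0
    for k in range(points):
        a0 = 2.0 * math.pi * k / points
        a1 = 2.0 * math.pi * (k + 1) / points
        q0, p0 = q_max * math.cos(a0), p_max * math.sin(a0)
        q1, p1 = q_max * math.cos(a1), p_max * math.sin(a1)
        area += 0.5 * (q0 * p1 - q1 * p0)  # signed area of one thin triangle
    return area


def quantized_energy(n):
    """Energy of the nth state of the oscillator, (n + 1/2) h nu."""
    return (n + 0.5) * H * NU


def cell_area(n):
    """Phase-space area between the orbits of states n and n + 1."""
    return orbit_area(quantized_energy(n + 1)) - orbit_area(quantized_energy(n))


if __name__ == "__main__":
    for n in range(4):
        print(n, round(cell_area(n), 6))
```

Under these assumptions the enclosed area is $E/\nu = (n + \tfrac{1}{2})h$, so each ring between neighboring orbits has area $h$, one stationary state per cell.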
In such a case the volume of phase space between this energy surface and the next adjacent one proves to be $h^n$ times the a priori probability.

For a quasi-ergodic system, as we have said, there are no quantities like the $J$'s which stay constant, other than the energy. There are still stationary states in the quantized problem, though they are not determined by ordinary quantum conditions. They are derived from solutions of the Schrödinger equation, however, and the boundary conditions lead to definite stationary states, as with one-dimensional motion. Thus we can always introduce energy surfaces in the phase space, corresponding to the quantized states. Generally quasi-ergodic systems are not degenerate, all energy levels being distinct, and the volume of phase space between successive energy levels will always be, at least to an approximation, equal to $h^n$. These relations prove to be of importance in investigating the statistical mechanics of collections of systems in the phase space.

218. Classical Statistical Mechanics in the Phase Space. In Chap. IX we have investigated the motion of a representative point in the phase space. Statistical mechanics, however, like any statistical science, deals not with single points but with an enormous number of individuals, investigating their average behavior. In its applications to thermodynamic problems, there are two principal methods, both of which are frequently used. In the first of these, we deal, for instance, with a gas composed of a great many identical molecules. These molecules themselves form the individuals whose average properties we investigate. Thus the phase space we use is one in which there are enough coordinates and momenta to describe a single molecule. Such a space is often called a $\mu$ space.
The second method is more powerful but more abstruse: the individuals with which we deal are whole systems, as whole samples of gas, and we imagine a large collection, often called an ensemble, or assembly, of such samples, all just alike in such gross properties as volume, temperature, and density, but differing in their finer details, as the positions and velocities of individual atoms or molecules. These might represent different pictures of the same gas at different times; or they might represent different repetitions of the same experiment, all controllable conditions being held fixed. Finding averages over such ensembles means then finding the time average, or finding the average obtained by repeating the experiment many times. The phase space required for this second method has as many coordinates and momenta as there are in the whole system, a very great number if the system contains many molecules. This space is often called the $\Gamma$ space. As to the distinction between the methods of the $\mu$ and the $\Gamma$ spaces, the general situation is that they are equivalent when applied to perfect gases; but if the molecules interact, they can no longer be treated as independent systems and described by separate points in the $\mu$ space, but one must instead consider the whole system together, and use the $\Gamma$ space. The latter method is then the one which we shall use more often.

Both methods are alike, however, in using phase spaces, and in considering the motion of a swarm of points in such a space. We imagine an ensemble of a great many, or even an infinite number, of points in a phase space. As time goes on, with the points moving, the effect is as of the whole swarm flowing, like a liquid or gas composed of atoms. In fact, many of the ideas of hydrodynamics can be applied in this case, as we shall show in the next section. We introduce first the density of points as a function of the $p$'s and $q$'s: $f(p_1 \cdots p_n, q_1 \cdots q_n)\,dp_1 \cdots dp_n\,dq_1 \cdots dq_n$ gives the number of points in the $2n$-dimensional volume element $dp_1 \cdots dp_n\,dq_1 \cdots dq_n$. The velocity of points in the phase space is then given by Hamilton's equations, $dq_i/dt = \partial H/\partial p_i$, $dp_i/dt = -\partial H/\partial q_i$, as we pointed out in Sec. 52, Chap. IX. Thus we have the necessary quantities to describe the motion of the points as a flow, and in the next section we apply the equation of continuity and investigate its consequences.

219. Liouville's Theorem. Consider the flow of a fluid of density $\rho$, velocity $v$. The equation of continuity is $\partial\rho/\partial t + \operatorname{div}(\rho v) = 0$, or
$$\frac{\partial \rho}{\partial t} + \rho \operatorname{div} v + v \cdot \operatorname{grad} \rho = 0, \tag{6}$$
if the density varies from point to point. We are interested particularly in a divergenceless flow, for which $\operatorname{div} v = 0$, for it turns out that the flow of points in the phase space is of this sort. It is easy to see that this corresponds to the flow of an incompressible fluid. For let us find $d\rho/dt$, the time rate of change of density. This is given by
$$\frac{d\rho}{dt} = \frac{\partial \rho}{\partial t} + \frac{\partial \rho}{\partial x}\frac{dx}{dt} + \frac{\partial \rho}{\partial y}\frac{dy}{dt} + \frac{\partial \rho}{\partial z}\frac{dz}{dt} = \frac{\partial \rho}{\partial t} + v \cdot \operatorname{grad} \rho, \tag{7}$$
where $d\rho/dt$ is the rate at which density changes if we follow along with a particle of fluid. But now if $\operatorname{div} v = 0$, Eq. (6) becomes $d\rho/dt = 0$, showing that the density following the particle does not change with time, which is to be expected if the fluid is incompressible. This does not imply, however, that the density of the fluid is at all points the same. Let us imagine a fluid composed of large droplets of one sort of fluid suspended in another. If the fluids are chosen so that they do not mix, and the surfaces of separation remain sharp, then the density will change from point to point, as we go from the one fluid to the other. Further, if the whole fluid is moving, the density at any point of space will change with time, as first the one sort of fluid, then the other, will be carried past this point.
But if the fluid is incompressible, the density of a particular part of the fluid, as we follow it in its motion, will be constant. That is, $v \cdot \operatorname{grad} \rho$ and $\partial\rho/\partial t$ are separately different from zero, but their sum vanishes.

The situation we have just described holds for the motion of points in the phase space. The $2n$-dimensional velocity of points, as we have seen in the last section, has components $dq_i/dt$, $dp_i/dt$, where $i$ goes from 1 to $n$. Then the analogue to the divergence is
$$\operatorname{div} v = \frac{\partial}{\partial q_1}\frac{dq_1}{dt} + \frac{\partial}{\partial q_2}\frac{dq_2}{dt} + \cdots + \frac{\partial}{\partial p_1}\frac{dp_1}{dt} + \cdots = \frac{\partial}{\partial q_1}\frac{\partial H}{\partial p_1} + \frac{\partial}{\partial q_2}\frac{\partial H}{\partial p_2} + \cdots - \frac{\partial}{\partial p_1}\frac{\partial H}{\partial q_1} - \cdots = 0. \tag{8}$$
Thus on account of Hamilton's equations the flow is divergenceless. Then we see that the flow is an incompressible flow, the density of points remaining constant as we follow a particle. This is Liouville's theorem.

220. Distributions Independent of Time. The principal use of distributions in the phase space is for thermodynamic purposes, and here we are interested in thermal equilibrium, and in distributions independent of time. An ensemble independent of time is one for which $\partial\rho/\partial t = 0$. To get that, we see from Eq. (7) that we must have $v \cdot \operatorname{grad} \rho = 0$. This means that the rate of change of density along the direction of flow, or along the streamline, is zero. In other words, all along a single line of flow, or through a single tube of flow of infinitesimal cross section, the density is constant. We may imagine the whole phase space divided up into tubes of flow, and then any distribution in which each tube has its own constant density through its whole volume, no matter how this density may change from one tube to another, will be independent of time.

In a multiply periodic motion, the lines of flow will be given by $J_1 = \text{constant}$, $J_2 = \text{constant}$, $\cdots$. Thus if we make the density any arbitrary function of the $J$'s, we shall have a distribution independent of time.
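The vanishing of the divergence in Liouville's theorem can be verified numerically for a particular case. What follows is a rough sketch under an assumption of our own: the Hamiltonian $H = p^2/2 + q^2/2 + q^4/4 + 0.3\,q^2 p$ is chosen only for illustration (it does not appear in the text), and the function names are hypothetical. The velocity components are written out from Hamilton's equations, and the divergence is estimated by central differences.

```python
def velocity(q, p):
    """Phase-space velocity (dq/dt, dp/dt) from Hamilton's equations.

    Assumed illustrative Hamiltonian:
        H = p**2/2 + q**2/2 + q**4/4 + 0.3*q**2*p
    """
    dq_dt = p + 0.3 * q * q              # dH/dp
    dp_dt = -(q + q ** 3 + 0.6 * q * p)  # -dH/dq
    return dq_dt, dp_dt


def divergence(q, p, step=1e-5):
    """Central-difference estimate of d(dq/dt)/dq + d(dp/dt)/dp."""
    dq_plus, _ = velocity(q + step, p)
    dq_minus, _ = velocity(q - step, p)
    _, dp_plus = velocity(q, p + step)
    _, dp_minus = velocity(q, p - step)
    return (dq_plus - dq_minus) / (2.0 * step) + (dp_plus - dp_minus) / (2.0 * step)


if __name__ == "__main__":
    for q, p in [(0.5, -1.0), (1.2, 0.7), (-2.0, 3.0)]:
        print(q, p, divergence(q, p))
```

In this example the two terms of the divergence are separately $0.6q$ and $-0.6q$, neither of which is zero in general, but their sum vanishes at every point, as Eq. (8) requires.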
Remembering that the density is the function $f$, this is
$$f(p_1 \cdots p_n, q_1 \cdots q_n) = F(J_1, J_2, \cdots J_n). \tag{9}$$
On the other hand, in a quasi-ergodic motion, a single line of flow will in time come arbitrarily near to every point of an energy surface. Thus the only distribution which will be independent of time in this case is one in which the density is constant all over an energy surface:
$$f(p_1 \cdots p_n, q_1 \cdots q_n) = F(E). \tag{10}$$
Of course, the ensemble (10) would be independent of time even in a multiply periodic motion, but it is more specialized than is necessary in that case. For instance, in an ensemble of systems each consisting of a particle in central motion, we could make an ensemble in which all parts of the phase space corresponding to the same energy had the same density, and this would be constant in time. But we could equally well make the density different in different parts of the space corresponding to the same energy but different angular momentum, and still, as long as the angular momentum was conserved, this distribution would be constant. Any perturbation which involved slow changes of angular momentum, however, would destroy the constancy of this distribution, whereas if we had started with one which depended only on energy, it would not be affected by such a perturbation. The ordinary systems which we deal with thermodynamically are assumed to be so complicated that they are quasi-ergodic. Thus the only type of ensemble independent of time is that of Eq. (10), in which the density is a function of the energy. This is the sort which we shall consider in thermodynamic applications.

221. The Microcanonical Ensemble. A particularly important ensemble is that called the microcanonical ensemble, in which all the systems of the ensemble have practically the same energy. More precisely, we have
$$f(p_1 \cdots p_n, q_1 \cdots q_n) = F(E) = \begin{cases} \text{constant} & \text{for } E_0 < E < E_0 + \Delta E, \\ 0 & \text{otherwise.} \end{cases} \tag{11}$$
It is evident that an arbitrary ensemble can be made up by superposing microcanonical ensembles, the ensemble whose systems lie between $E_0$ and $E_0 + \Delta E$ having a constant density so chosen as to give the proper density in that particular part of the energy space. In thermodynamics the microcanonical ensemble is often used, when we wish to deal with the statistical properties of systems at a given temperature, for energy content is correlated with temperature in such a way that systems of the same temperature have just about the same energy, and therefore are represented at least approximately by a microcanonical ensemble.

222. The Canonical Ensemble. More suitable than the microcanonical ensemble for discussing temperature equilibrium proves to be a slightly different one called the canonical ensemble. In this distribution the density function is given by
$$f = F(E) = \text{constant} \times e^{-E/kT}, \tag{12}$$
where $E$ is the energy, $k$ a constant, called Boltzmann's constant, equal to $1.38 \times 10^{-16}$ c.g.s. units (erg per degree), and $T$ is the absolute temperature. We shall discuss in a later chapter the particular properties of this ensemble, and its advantages. This ensemble has not only the property of remaining unchanged with time, if the system is left to itself, but also of remaining unchanged if the system can interchange energy with another of the same temperature. This is evidently necessary for thermal equilibrium, and the canonical ensemble is the only one in general which has this property.

From this ensemble we can derive interesting results, though we mention only a few. We may, for instance, use the $\mu$ space, each system being a molecule. The energy of such a molecule is $\frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + V$, so that the probability of finding a molecule having its coordinates and momenta within the limits $x$ and $x + dx$, $y$ and $y + dy$, $z$ and $z + dz$, $p_x$ and $p_x + dp_x$, etc., is proportional to
$$e^{-\left[\frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + V\right]/kT}\,dx\,dy\,dz\,dp_x\,dp_y\,dp_z. \tag{13}$$
This law is ordinarily called the Maxwell-Boltzmann distribution law. From it we can easily find that the velocities are distributed according to Maxwell's distribution of velocities, and that the density in ordinary space at different points is proportional to $e^{-V/kT}$. We leave these proofs for problems.

If on the other hand we use the $\Gamma$ space, $E$ represents the energy of the whole sample of gas, and we can prove easily that the energy of an individual sample in the ensemble is very nearly the same as that of any other sample. Thus for such a system the canonical ensemble is very similar to the microcanonical ensemble. One gets the same thermodynamic results using either ensemble, but the canonical ensemble is both more correct theoretically and decidedly simpler for most of its applications.

223. The Quantum Theory and the Phase Space. In Sec. 210, Chap. XXIX, we have seen that a stationary state of a one-electron problem corresponds to a classical particle whose energy is determined, but whose initial time of starting is undetermined. More accurately, it corresponds to an ensemble of particles, all of the same energy, but with phases distributed in such a way that the properties of the ensemble are independent of time. This, however, is exactly a microcanonical ensemble. This may be connected with the uncertainty principle for energy, Eq. (2) of Chap. XXVIII, which states that the uncertainty of energy multiplied by the uncertainty of time is of the order of $h$. If then we set up an ensemble of particles all of exactly the same energy, it must necessarily be true that the uncertainty of time of one of the particles is infinite. That is, we know nothing at all as to its phase, or the ensemble consists of particles in all possible phases. And since it is a stationary state we are dealing with, nothing depends on time.
In other words, with the quantum theory, the mere process of setting up a stationary state automatically sets up a microcanonical ensemble. We need not do that specially, and we need not prove Liouville's theorem to find out how to get ensembles independent of time. In this way the quantum mechanics is more convenient for statistical purposes than classical mechanics.

With problems with several variables, the stationary state certainly represents an ensemble independent of time. If the problem is multiply periodic, it will represent an ensemble of states all of the same $J$ values (that is, the same set of quantum numbers), but arbitrary phases. On the other hand, if it is quasi-ergodic, it will represent a microcanonical ensemble. And even in a multiply periodic, degenerate case, where there are several stationary states of the same energy, we can always set up a microcanonical ensemble, by combining all the various states of the same energy. Each one of these states will correspond to a volume $h^n$ of the phase space. Then if the microcanonical ensemble is to have a constant density of points over a region between two energy surfaces, it will have a definite number of points for each element of volume $h^n$, and hence a constant and equal number of points for each of these substates of the same energy. We may say that in this ensemble the number of systems in any group of substates is proportional to the a priori probability of this group of states; that is, simply proportional to the number of substates in the group.

The distribution function $f(p_1 \cdots p_n, q_1 \cdots q_n)$ for the quantum theory involves us in rather complicated considerations, which we shall take up in the next chapter.
The reason is that the probability function which we are given directly is the square of the wave function, $\psi$, and that is a function of the coordinates only, giving the probability of finding the coordinates within certain limits, independent of the momenta. In Sec. 210, Chap. XXIX, we have shown that this probability function approximately agrees with that found in classical mechanics. We postpone other comparisons between the quantum and classical distributions.

But there is one feature of the quantum distribution function which should be mentioned at the outset. We have spoken above as if one could draw the paths of particles, and set up distribution functions, in the phase space, for the quantum theory as for the classical theory. But this is really not possible, as we can see from the uncertainty principle. This says that the uncertainty in the coordinate of a particle, multiplied by the uncertainty of its momentum, is of the order of magnitude of $h$. This product of uncertainties is simply an area in phase space. Instead of representing the particle by a sharp point, we can visualize it as a region in phase space, of dimensions $\Delta q$ and $\Delta p$ along the two axes. By the uncertainty principle, the area of this region is $h$. If we had the same thing in a number of dimensions, as $n$ variables, the $2n$-dimensional volume associated with the uncertain position and momentum of the particle or representative point would be $h^n$, just the volume associated with a stationary state. As a result of this uncertainty, we must always be cautious about using the ideas of definite paths of representative points in the phase space. It would perhaps be more accurate to think of the paths, and energy surfaces, as having definite thicknesses, as if the point carried along its volume $h^n$, and allowed that to trace out a finite region of phase space.

The canonical ensemble can be set up in quantum theory as in classical mechanics.
In the classical theory, it is the ensemble in which the number of points per unit volume is proportional to $e^{-E/kT}$. In quantum theory, the number of points in volume $h^n$, or the number in a given stationary state, is proportional to $e^{-E/kT}$, or this exponential is proportional to the probability of finding a system, chosen at random from the ensemble, in the stationary state in question. If we group together a number of degenerate substates all of energy $E$, and if there are $g$ of them, so that the a priori probability of the group is $g$, the number of systems in the group is proportional to $g e^{-E/kT}$.

Problems

1. Take the problem of a particle executing one-dimensional motion in a container with constant potential inside, but impenetrable walls, as in Prob. 11, Chap. XXIX. Plot the path of the representative point in phase space, find the phase integral, and show that the quantum condition leads to the same stationary states and energy levels that were determined previously, except that it leads to half rather than whole quantum numbers.

2. For the system of Prob. 1, compare (a) the frequency of oscillation of the particle back and forth between the walls, as determined classically by elementary argument; (b) the same frequency as determined by the formula $dH/dJ$; (c) the emitted frequency on the quantum theory.

3. Draw the phase space for a rotator, as described in Sec. 213, and verify the quantum condition stated there.

4. Apply the correspondence principle to the radiation from a linear oscillator. Show that the Fourier components of the classical motion are zero corresponding to all transitions except those in which the quantum number changes by one unit only. From this one may infer that in the quantum theory only this particular transition can occur, the probability of any other sort of transition being zero. Such a result is called a selection principle.

5. Consider the motion of Prob. 6, Chap. IX, in which a particle executed simple harmonic motion on a rotating turntable. Assume that one quantum number, and phase integral, is associated with the rapid frequency of oscillation, and the other phase integral with the slower frequency of rotation of the turntable. From the Fourier analysis of the $x$ component of motion, show that the only allowed transitions are those in which each quantum number changes by $\pm 1$ unit. Show further that both must change together, there being no transitions of one quantum number alone, but that a transition of $+1$ unit in one of the quantum numbers is equally likely to occur in connection with either $+1$ or $-1$ of the other.

6. Find Maxwell's distribution of velocities, stating that the number of molecules of a gas for which the velocity is between $v$ and $v + dv$ is proportional to $v^2 e^{-mv^2/2kT}\,dv$. To do this, use $\mu$ space, assume the Maxwell-Boltzmann distribution law. Consider a fixed point of space, so that $x$, $y$, $z$ are constant, and we need only consider the three-dimensional momentum space. Note that the velocity is proportional to the radius $\sqrt{p_x^2 + p_y^2 + p_z^2}$ in the momentum space. The number of molecules between $v$ and $v + dv$ is then proportional to the density of molecules in the momentum space, which from the Maxwell-Boltzmann law is constant for constant $v$, times the volume of momentum space between $v$ and $v + dv$, which can be computed from the ordinary geometrical relations of a sphere. Determine the constant factor in the law so that your formula will give directly the fraction of all molecules in the range $dv$.

7. Find the mean kinetic energy of a molecule at temperature $T$. Note that the mean of any quantity $F(p, q)$ is given by
$$\bar{F} = \frac{\int F(p, q)\, f(p, q)\, dp \cdots dq \cdots}{\int f(p, q)\, dp \cdots dq \cdots},$$
where $f(p \ldots q \ldots)$ is the density function in the phase space, and the integration is over all parts of the phase space. Note also that since in this case $F$ depends only on the momentum, the integrals in numerator and denominator can be factored into one integral over the momenta, one over the coordinates, and that the latter cancel out.

8. By integrating over all momenta, show that the space density of molecules in a gas is proportional to $e^{-V/kT}$. Apply this to the density of the atmosphere in the earth's gravitational field, assuming constant temperature. Find from this the rate of decrease of barometric pressure with altitude, at the earth's surface, assuming a reasonable atmospheric temperature.

9. In the $\Gamma$ space, consider a canonical ensemble of systems each consisting of $N$ identical molecules, where $N$ is very large. Assume that no forces act. Find the number of systems of the ensemble for which the total energy is between definite limits $E$ and $E + dE$. To do this, note that the energy is proportional to $p_{x1}^2 + p_{y1}^2 + \cdots + p_{zN}^2$, or the square of the radius in the $3N$-dimensional momentum space, so that the part of the space between $E$ and $E + dE$ is the region between two corresponding hyperspheres. Note that the "volume" of a two-dimensional "sphere" (a circle) is $\pi r^2$; of a three-dimensional one, $\tfrac{4}{3}\pi r^3$; of an $n$-dimensional one, a constant times $r^n$. Also note that the volume between $r$ and $r + dr$ is $\frac{d(\text{volume})}{dr}\,dr$.

10. Show that the fraction of all systems in a canonical ensemble for which the energy is between $E$ and $E + dE$ is approximately given by a Gaussian error curve, $A e^{-c(E - a)^2}$. Find $c$ and $a$. (Hint: The function found in Prob. 9 has a very sharp maximum, to be approximated by the error curve above. Expand the logarithm of the function in power series about its maximum, $a$, so that the logarithm equals $\text{constant} - c(E - a)^2 + \cdots$, where there is no first power term because the expansion is about the maximum, and higher power terms than the second are to be neglected.
Then the function is the exponential of this power series, giving the value above. Show that the third and higher power terms are negligible unless $E - a$ is so large that the function itself has sunk to a negligible value.)

11. In the distribution of Prob. 10, show that the mean energy of the systems of the ensemble is just $N$ times the mean energy of a single molecule, as found in Prob. 7. To get an idea of the range of distribution of energy about this mean, find the energy for which the Gaussian distribution curve falls to half its maximum value. Show that the energy difference between this value and the mean increases proportionally to $\sqrt{N}$, but that the percentage deviation of the energy, or the deviation divided by the total energy, goes down as $1/\sqrt{N}$, so that for large $N$ the percentage deviation is extremely small.

CHAPTER XXXI

MATRICES

Suppose we have a problem, like the linear oscillator, in which there are no motions which go to infinity; that is, in which every motion is quantized, so that only discrete energy values are allowed. Let the $n$th energy value be $E_n$, the corresponding wave function $u_n$. Then a general solution of the wave problem, involving the time, is
$$\psi = \sum_n c_n e^{-2\pi i E_n t/h}\, u_n(x, y, z), \tag{1}$$
where we choose the negative exponential for reasons which will appear later. This function will shortly be derived as the solution of a wave equation involving the time, though we have not yet written down that equation. Now let us recall the meaning of $\psi$. It is the amplitude of a wave whose intensity gives the probability that the particle be found at a given place at a given time. Since $\psi$ is complex, this intensity is given by multiplying by its conjugate; hence $\bar{\psi}\psi$ gives the desired probability. Or more precisely, the probability that the particle, at time $t$, is in the volume element $dx\,dy\,dz$, is $\bar{\psi}\psi\,dx\,dy\,dz$.
One result appears at once from this: the probability that the particle be somewhere is unity, and this must be the sum of the probabilities that it be in all separate parts of space, or the integral of the probability over all space:
$$\iiint \bar{\psi}\psi\,dx\,dy\,dz = 1. \tag{2}$$
Now having the probability, we can proceed to get statistical information about the behavior of the particle.

224. Mean Value of a Function of Coordinates. As we have seen in the last chapter, the first step in a statistical investigation is to find a distribution function. There we were interested in functions of coordinates and momenta of a particle or system, and we had a function $f(q_1 \cdots q_n, p_1 \cdots p_n)$, such that $f\,dq_1 \cdots dp_n$ gave the number of systems having coordinates and momenta in the range $dq_1 \cdots dp_n$. To find the average of any quantity, given such a distribution, we proceed as follows: if the quantity is $F(q_1 \cdots p_n)$, a function of coordinates and momenta, we multiply the function by the fraction of systems having those particular $q$'s and $p$'s, and integrate over all $q$'s and $p$'s. This fraction is $\dfrac{f\,dq_1 \cdots dp_n}{\int f\,dq_1 \cdots dp_n}$, so that the result is
$$\bar{\bar{F}} = \frac{\int F f\,dq_1 \cdots dp_n}{\int f\,dq_1 \cdots dp_n}, \tag{3}$$
where we denote the average of $F$ by $\bar{\bar{F}}$, to avoid confusion with the single bar indicating complex conjugates. Similarly in the present case we have a function $\bar{\psi}\psi$ which is a distribution function as far as coordinates are concerned: $\bar{\psi}\psi\,dx\,dy\,dz$ is the probability (directly, since $\int \bar{\psi}\psi\,dx\,dy\,dz = 1$) that the particle have coordinates within $dx\,dy\,dz$. Thus if we have a function $F(x, y, z)$ of the coordinates, and wish its mean value, we have
$$\bar{\bar{F}} = \int F\,\bar{\psi}\psi\,dx\,dy\,dz = \int \bar{\psi} F \psi\,dx\,dy\,dz, \tag{4}$$
where we prefer the latter method of writing it because it fits in with formulas which we shall have later. This does not tell us how to find averages of functions of the momenta, such as for example the energy; that is more complicated, and will be discussed in a later section.
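The mean value formula (4) lends itself to a simple numerical check. The sketch that follows rests on assumptions of our own: it uses a single stationary state of a particle between impenetrable walls a distance $L$ apart, with the normalized real wave function $u_n = \sqrt{2/L}\,\sin(n\pi x/L)$ in one dimension, and evaluates the integral by the midpoint rule. The names `wave_function` and `mean_value` are hypothetical.

```python
import math

LENGTH = 1.0  # assumed distance L between the walls


def wave_function(n, x):
    """Assumed normalized real wave function u_n(x) = sqrt(2/L) sin(n pi x / L)."""
    return math.sqrt(2.0 / LENGTH) * math.sin(n * math.pi * x / LENGTH)


def mean_value(func, n, points=4000):
    """Midpoint-rule estimate of the integral of u_n * F(x) * u_n over 0 < x < L."""
    dx = LENGTH / points
    total = 0.0
    for k in range(points):
        x = (k + 0.5) * dx
        u = wave_function(n, x)
        total += u * func(x) * u * dx  # the wave function is real, so its conjugate is u
    return total


if __name__ == "__main__":
    print(mean_value(lambda x: 1.0, 1))    # normalization integral, as in Eq. (2)
    print(mean_value(lambda x: x, 1))      # mean position
    print(mean_value(lambda x: x * x, 1))  # mean square position
```

With $F = 1$ the routine reproduces the normalization of Eq. (2), and with $F = x$ it gives the mean position, which by symmetry lies midway between the walls.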
But we may wish, for instance with an atom or molecule, to find the mean value of the center of gravity, or moment of inertia, or some such function of position alone, and the formula suffices to determine it.

It is now very interesting to substitute our expansion of $\psi$ in the expression for a mean value. That gives
$$\bar{\bar{F}} = \sum_{n,m} \bar{c}_n c_m e^{\frac{2\pi i}{h}(E_n - E_m)t} \int \bar{u}_n F u_m\,dx\,dy\,dz = \sum_{n,m} \bar{c}_n c_m e^{\frac{2\pi i}{h}(E_n - E_m)t} F_{nm}, \tag{5}$$
where by definition $F_{nm} = \int \bar{u}_n F u_m\,dx\,dy\,dz$. The quantities $F_{nm}$ form a two-dimensional array of numbers, of the sort known in mathematics as a matrix, and the individual $F_{nm}$'s are called matrix components.

225. Physical Meaning of Matrix Components. Suppose we have an electron in an atom, and try to find its electric moment as a function of time; that is, its charge $e$ times the displacement of the electron, $x$. In other words, we wish the mean value of $ex$, which is
$$\overline{\overline{ex}} = \sum_{n,m} \bar{c}_n c_m e^{\frac{2\pi i}{h}(E_n - E_m)t}\,(ex)_{nm}. \tag{6}$$
We observe that in the mean moment the terms depend on time through the expression $e^{\frac{2\pi i}{h}(E_n - E_m)t}$, having the frequency $(E_n - E_m)/h$. But this is just the frequency which by the quantum theory the atom should emit in jumping between the energy levels $m$ and $n$. Hence we connect this particular matrix component with this transition. By the correspondence principle, in Sec. 217, Chap. XXX, we have already seen that associated with each transition there is a classical frequency of oscillation, and a corresponding Fourier component of the motion. It can now be shown that this Fourier component, in the limit of large quantum numbers, becomes equal to the matrix component $(ex)_{nm}$ of the electric moment, which appears in Eq. (6). The individual terms of Eq. (6) act like oscillators, radiating energy, and it proves to be true, though it requires a difficult analysis to show it, that the rate of radiation of the oscillator determines exactly the quantum probability of transition.
For example, if a matrix component is zero, there will be no radiation of the corresponding frequency, no transitions are possible between the stationary states concerned, and we have what is called a selection principle, a principle selecting out certain transitions which can occur, the rest being forbidden.

The matrix components which we have noticed have been those where $m$ and $n$ were different. If we make a scheme of matrix components like

$$\begin{matrix} F_{11} & F_{12} & F_{13} & \cdots \\ F_{21} & F_{22} & F_{23} & \cdots \\ F_{31} & \cdots & & \end{matrix} \tag{7}$$

we see that the components $F_{11}$, $F_{22}$, etc., along the principal diagonal all have $m = n$, so that our components with $m \neq n$ are just the nondiagonal components. The diagonal components, however, have a different meaning. They refer to time average properties of the system, rather than to the sinusoidal properties which are connected with radiation. Thus if we take the time average of $\overline{\overline{F}}$ (where the averaging in $\overline{\overline{F}}$ refers to averaging over the probability distribution, not over time), the exponential term $e^{\frac{2\pi i}{h}(E_n - E_m)t}$ averages to zero, unless $n = m$, in which case it is unity. Hence we have

$$\text{time average of } \overline{\overline{F}} = \sum_n \bar{c}_n c_n F_{nn}, \tag{8}$$

the double sum reducing to a single sum. Here, as we said above, only the diagonal components of the matrix of $F$ appear.

We can understand this formula better if we notice the meaning of the $c$'s. To get at this, we observe that the $c$'s are the amplitudes by which the various overtones are multiplied, in order to get the whole wave function. Thus the quantities $\bar{c}_n c_n$, the squares of these amplitudes (taking account of the fact that they may be complex by multiplying by the conjugate), are quantities proportional to the intensities of the various overtones; and the interpretation of this is that they are proportional to the probability that the particle be in a given stationary state. As a matter of fact, we shall soon show that $\bar{c}_n c_n$ represents just the probability itself, the sum of all the probabilities, $\sum_n \bar{c}_n c_n$, being unity.
Thus the formula

$$\text{time average of } \overline{\overline{F}} = \sum_n \bar{c}_n c_n F_{nn}$$

means that $F_{nn}$ is the time average of $F$ over the $n$th stationary state, and $\bar{c}_n c_n$ the probability of finding the system in this stationary state, so that we multiply together and add to get the average over all stationary states.

226. Initial Conditions, and Determination of c's. Just as with the problem of the vibrating string, we may have initial conditions: we may know that the distribution $\psi$ has a certain value at $t = 0$. Let us take a specific example: we may know that at $t = 0$ the particle is inside a given small volume, though we do not know where in that volume. Then we may ask as to its probable later motion. That is, we know that $\psi(x, y, z, t)$ is zero, at $t = 0$, outside the small volume, and has a constant value, or at least a sinusoidal form with constant amplitude, inside the volume. Now at $t = 0$, the exponentials become unity, so that we have

$$\psi(x, y, z, 0) = \sum_n c_n u_n(x, y, z).$$

But this is just the familiar problem of expanding an arbitrary function of $x$, $y$, $z$ in a series of functions $u_n$. These are orthogonal functions; they are solutions of Schrödinger's equation, which is of the type already discussed in Prob. 10, Chap. XV, where we showed in general that the solutions were orthogonal. We assume them to be also normalized. Thus the $c$'s are simply the coefficients of expansion, determined directly by multiplying by the corresponding normal function and integrating. We must be careful of only one thing: our functions are now often complex, and when we multiply two such functions together, in such cases, it proves to be necessary always to multiply so that a function and a conjugate appear together. Thus we have

$$\int \psi(x, y, z, 0)\, \bar{u}_m(x, y, z)\, dx\, dy\, dz = \sum_n c_n \int u_n \bar{u}_m\, dx\, dy\, dz.$$

But now the orthogonality is such that $\int u_n \bar{u}_m\, dx\, dy\, dz$ is unity if $n = m$, zero if $n \neq m$, so that we have

$$c_m = \int \psi\, \bar{u}_m\, dv. \tag{9}$$

The physical situation is then this.
If we know initially the distribution of coordinates, we can find a $\psi$ satisfying the conditions, and in general all the $c$'s will be different from zero. That is, all overtones will be excited, or the system will be partly in each stationary state. We may say, if we choose, that we have an ensemble, and that a system of this ensemble has a probability $\bar{c}_n c_n$ of being in the $n$th state. If now we ask how $\psi$ changes with time, we can see that the particle will no longer have the initial distribution of probability, but that the probability will change with time. For example, if we originally know it is in a small volume, this will not continue to be true as time goes on; it will have a chance of moving out of the volume. The reason is that the different waves cooperate to give just the right function at $t = 0$, but they vibrate with different frequencies, and soon they get out of step, and can no longer cooperate properly. Thus a general wave function, made by superposing many stationary states, does not represent an ensemble independent of time, though the wave function of a single stationary state does.

Though the probabilities as functions of the coordinates change with time, it is significant that the $c$'s, being constants, do not. Thus the probability of finding the atom in a given stationary state does not change. The atoms do not go from one to another, and the states are really stationary. This is all true only if we neglect radiation, or external forces. If there is radiation, the whole situation will be altered, the $c$'s will change with time, and the time rate of change of any $\bar{c}c$ will be interpreted as being connected with a corresponding probability that atoms are having transitions to or from this state. It is much as with vibrating strings: if the string is started off with a complicated shape, this shape will be soon changed, but if there is no friction we can analyze the motion into overtones, and each overtone preserves its amplitude.
If friction is present, however, the overtones change their amplitudes.

227. Mean Values of Functions of Momenta. The method of finding mean values of functions of the coordinates is perfectly straightforward, but the treatment of the momenta is peculiar, and is one of the characteristic features of wave mechanics. The momentum shows itself in the wave function through the wave length of the wave, and in order to get information about wave length, it turns out that the proper procedure is to differentiate the wave function. We can find the correct formulas from a very simple case; and since we are setting up a theory which is not derived from any other, we can do nothing but postulate the general formulas, which prove to be the same ones that we find in this special case. Thus suppose we have a free particle in empty space, traveling with a momentum $p$, energy $E$. Its wave function, if it travels along the $x$ axis, will be $e^{\frac{2\pi i}{h}(px - Et)}$, corresponding to the wave length given by $1/\lambda = p/h$. More generally, if its components of momentum along the three axes are $p_x$, $p_y$, $p_z$, its wave function will be $e^{\frac{2\pi i}{h}(p_x x + p_y y + p_z z - Et)}$, a plane wave. If we wanted to find the mean $x$ momentum of this particle, we should multiply $p_x$ by the probability, and integrate; we should get $p_x$, of course, since the mean value of a constant is the same constant. But the question is, how is this to be generalized so that it can be used in more complicated cases, where the momentum does not appear explicitly, and is not constant? The answer proves to be the following: If our function is $\psi$, we observe that $\frac{h}{2\pi i}\frac{\partial \psi}{\partial x}$ equals $p_x \psi$. Thus if we form the expression $\bar{\psi}\,\frac{h}{2\pi i}\frac{\partial \psi}{\partial x}$ and integrate, the answer will be the same as integrating $\bar{\psi} p_x \psi$, which gives $p_x$. Similarly, $\int \bar{\psi} \left(\frac{h}{2\pi i}\frac{\partial}{\partial x}\right)^2 \psi\, dv$ would give $p_x^2$, and so on. In other words, the operator $\frac{h}{2\pi i}\frac{\partial}{\partial x}$, operating on $\psi$, and averaged, can be taken to stand for the $x$ component of momentum.
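The statement that $\frac{h}{2\pi i}\frac{\partial}{\partial x}$ acting on a plane wave reproduces $p_x$ times the wave can be checked numerically. The following is only an illustrative sketch: the function names, the choice of units with $h = 1$, the sample momentum, and the finite-difference step are all assumptions made for the example, and the time factor of the wave is omitted since it does not affect the $x$ derivative.

```python
import cmath
import math

H_PLANCK = 1.0  # Planck's constant h in arbitrary units (assumed for illustration)


def plane_wave(x, p, h=H_PLANCK):
    """Spatial part of the plane wave, exp((2*pi*i/h) * p * x)."""
    return cmath.exp(2j * math.pi * p * x / h)


def momentum_operator(psi, x, h=H_PLANCK, dx=1e-6):
    """Apply (h / (2*pi*i)) d/dx to psi at the point x, using a central difference."""
    dpsi_dx = (psi(x + dx) - psi(x - dx)) / (2 * dx)
    return h / (2j * math.pi) * dpsi_dx


p = 0.75   # hypothetical momentum of the free particle
x0 = 0.3   # arbitrary point at which the operator is applied


def psi(x):
    return plane_wave(x, p)


# The ratio (operator applied to psi) / psi should be very nearly the constant p.
ratio = momentum_operator(psi, x0) / psi(x0)
print(ratio)
```

Because the ratio is the same at every point, multiplying by $\bar{\psi}$ and integrating over a normalized distribution returns $p$ itself, which is the content of the rule stated above.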
It is now assumed that this process can be applied in general. Thus with any wave function $\psi$, the mean value of the $x$ component of momentum is $\int \bar{\psi}\, \frac{h}{2\pi i}\frac{\partial \psi}{\partial x}\, dv$. Or more generally, if we have any function of momenta and coordinates, say $F(x, y, z, p_x, p_y, p_z)$, we have for the mean value

$$\overline{\overline{F}} = \int \bar{\psi}\, F\!\left(x, y, z, \frac{h}{2\pi i}\frac{\partial}{\partial x}, \frac{h}{2\pi i}\frac{\partial}{\partial y}, \frac{h}{2\pi i}\frac{\partial}{\partial z}\right) \psi\, dv. \tag{11}$$

This is the general rule, reducing to our former one when $F$ involves only coordinates.

There is one difficulty connected with this, however. It turns out that if there are any terms in $F$ involving products of coordinates and momenta, the answer will depend on the order in which they occur. The best example is the case of the product $p_x x$. We have

$$p_x x\, \psi = \frac{h}{2\pi i}\frac{\partial}{\partial x}(x\psi) = \frac{h}{2\pi i}\psi + x\,\frac{h}{2\pi i}\frac{\partial \psi}{\partial x} = \frac{h}{2\pi i}\psi + x p_x \psi,$$

or

$$p_x x - x p_x = \frac{h}{2\pi i}. \tag{12}$$

This is the so-called commutation rule; it states that interchange, or commutation, of the order of a coordinate and momentum operator changes the value, since the difference is not zero. In most actual cases that we meet, we shall not be troubled by this difficulty of noncommutability of coordinates and momenta, but it is something against which we must be on our guard.

We notice by analogy with what we have done that, taking the wave function of the form given above, $-\frac{h}{2\pi i}\frac{\partial \psi}{\partial t} = E\psi$. This again is taken to be a general method of finding the energy of a wave function:

$$\overline{\overline{E}} = \int \bar{\psi} \left(-\frac{h}{2\pi i}\frac{\partial}{\partial t}\right) \psi\, dv.$$

If $\psi = \sum_m c_m e^{-\frac{2\pi i}{h}E_m t} u_m(x, y, z)$, we evidently have

$$-\frac{h}{2\pi i}\frac{\partial \psi}{\partial t} = \sum_m c_m E_m e^{-\frac{2\pi i}{h}E_m t} u_m(x, y, z).$$

Multiplying by $\bar{\psi}$, we have

$$\sum_{n,m} \bar{c}_n c_m E_m e^{\frac{2\pi i}{h}(E_n - E_m)t}\, \bar{u}_n u_m.$$

Integrating over the coordinates, the nondiagonal terms drop out on account of orthogonality of the $u$'s, and the rest reduces to

$$\overline{\overline{E}} = \sum_n \bar{c}_n c_n E_n, \tag{13}$$

a weighted mean of the energy of the various states.

228. Schrödinger's Equation Including the Time. We are now able to give a more general interpretation of Schrödinger's equation than was possible in Chap. XXIX.
We start with the classical expression $H(q_1, \ldots, q_n, p_1, \ldots, p_n) = E$, where $H$ is the Hamiltonian function, $E$ is the total energy, and the equation represents the conservation of energy. But now suppose we try to replace each side by the corresponding quantum theory expression, so that we shall be able to allow each side to act on $\psi$, and if we wish multiply by $\bar{\psi}$ and integrate to get averages. The first step is

$$H\!\left(q_1, \ldots, q_n, \frac{h}{2\pi i}\frac{\partial}{\partial q_1}, \frac{h}{2\pi i}\frac{\partial}{\partial q_2}, \ldots, \frac{h}{2\pi i}\frac{\partial}{\partial q_n}\right)\psi = -\frac{h}{2\pi i}\frac{\partial \psi}{\partial t}. \tag{14}$$

But this is just Schrödinger's equation, in the form involving the time (which we have not so far met). To show that it reduces to the form we have previously met, let us take the case of rectangular coordinates $x$, $y$, $z$. There

$$H = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right) + V,$$

so that the equation becomes

$$-\frac{h^2}{8\pi^2 m}\left(\frac{\partial^2 \psi}{\partial x^2} + \frac{\partial^2 \psi}{\partial y^2} + \frac{\partial^2 \psi}{\partial z^2}\right) + V\psi = -\frac{h}{2\pi i}\frac{\partial \psi}{\partial t}.$$

In this, let us assume a solution $\psi = e^{-\frac{2\pi i}{h}Et} u(x, y, z)$, where $E$ is a constant, to be identified at the proper time with the energy. Then the equation becomes

$$\left(-\frac{h^2}{8\pi^2 m}\nabla^2 + V\right)u = Eu, \tag{15}$$

which leads immediately to the form of Schrödinger's equation with which we are familiar.

229. Some Theorems Regarding Matrices. Suppose that we have an operator $F$, formed from a function of the $q$'s and $p$'s by replacing the $p$'s by differentiations, in the manner we have described. Then we have by definition $F_{nm} = \int \bar{u}_n F u_m\, dv$. But we can look at this in the following way. The $u_n$'s form a set of orthogonal unit vectors in function space. $F u_m$ is a function different in general from any of the $u$'s, and hence a different vector. The quantity $\int \bar{u}_n F u_m\, dv$ is the scalar product of $F u_m$ with $u_n$; that is, it is the component of $F u_m$ along the $n$th axis. But this suggests writing a vector equation:

$$F u_m = \sum_n F_{nm} u_n, \tag{16}$$

expressing $F u_m$ as a sum of unit vectors, each multiplied by the corresponding component.
To prove this, we need only multiply by $\bar{u}_k$ and integrate, when the right side, on account of orthogonality, leaves only $F_{km}$, in agreement with the definition. An example of such an expression is Schrödinger's equation not involving the time, which can be written

$$H u_n = E_n u_n, \tag{17}$$

if $E_n$ is the energy in the $n$th state. This obviously expresses the fact that the matrix of $H$ has only diagonal components ($H_{nm} = E_n$ if $n = m$, zero if $n \neq m$), so that, since $H$ has no nondiagonal components, it has no terms depending on time, or is a constant.

It is interesting to write down the matrix of a constant, for example a number $C$. Evidently $C_{nm} = \int \bar{u}_n C u_m\, dv = C$ if $n = m$, $0$ if $n \neq m$. A particular case is the matrix of unity, $\int \bar{u}_n u_m\, dv = 1$ if $n = m$, $0$ if $n \neq m$, simply the orthogonality and normalization conditions. This matrix is often called $\delta_{nm}$; by definition $\delta_{nm} = 1$ if $n = m$, $0$ if $n \neq m$. In terms of this, we have $C_{nm} = C\delta_{nm}$. And we can write the matrix of the energy as

$$H_{nm} = E_n \delta_{nm}. \tag{18}$$

This matrix equation, stating that the matrix of the energy is a diagonal matrix with the characteristic values $E_n$, may be taken as a matrix statement of Schrödinger's equation; we readily see that it is just what would be obtained by multiplying Schrödinger's equation by an arbitrary $\bar{u}_m$, and integrating. We shall actually use this matrix equation later in discussing perturbation theory.

It is to be noted that a matrix depends on two things: first, the operator, and secondly, the set of orthogonal functions with respect to which it is computed. Thus a given operator, as energy or angular momentum or $x$ coordinate, can have its matrix computed in any set of orthogonal functions. The problem of solving Schrödinger's equation with a given energy operator may be considered as that of finding the particular set of orthogonal functions which makes that operator diagonal.
In a similar way we can find a set of orthogonal functions which would make any other desired operator have a diagonal matrix. We shall see in the next chapter that this involves us essentially in a rotation of axes in function space, similar to what we found in introducing normal coordinates in vibration problems.

From our expansion of $F u_m$ in series in the $u_n$'s, we can easily get the method of multiplying matrices, which is very useful in matrix manipulation. Suppose that we have two operators $F$ and $G$, and know the matrix components $F_{nm}$ and $G_{nm}$. We can then find easily the matrix components of the product operator $FG$. For we have

$$G u_n = \sum_m G_{mn} u_m, \qquad FG u_n = \sum_m G_{mn} F u_m = \sum_{m,k} G_{mn} F_{km} u_k = \sum_k \left(\sum_m F_{km} G_{mn}\right) u_k.$$

But also

$$(FG) u_n = \sum_k (FG)_{kn} u_k,$$

by the earlier formula. Hence

$$(FG)_{kn} = \sum_m F_{km} G_{mn}, \tag{19}$$

the formula for multiplying matrices.

It is a rather remarkable fact that the method of operating with matrices was discovered before the wave mechanics. This multiplication rule, and the commutation rule, were both developed. They were used for a number of complicated calculations, without use of wave functions, for example for finding the energy levels of the linear oscillator, its intensities of radiation, and even the energy levels of the hydrogen atom. For a few problems, as perturbation theory, the matrix method is still more convenient than the wave method, as we shall see.

Problems

1. Prove that a coordinate commutes with another coordinate; a momentum commutes with another momentum; and a coordinate commutes with a momentum conjugate to another coordinate.

2. Write down the operators for the three components of angular momentum in rectangular coordinates.

3. If $F$ is any operator, prove that $\frac{h}{2\pi i}\frac{dF}{dt} = (HF - FH)$, where $H$ is the Hamiltonian operator, the equation above to be regarded as either an operator or matrix equation. To prove it, take average values of the operators.
Find the average value of $F$, differentiate it with respect to time, to get the left side of the expression. On the right, in computing the average values, use the multiplication rule to compute the matrices of $HF$ and $FH$, noting that $H$ has a diagonal matrix. Finally identify terms on both sides of the equation.

4. Using the result of Prob. 3, prove that the time rate of change of the energy is zero; prove that $H$ and $t$ satisfy a commutation relation like a momentum and coordinate.

5. Show that for the linear oscillator the assumptions

$$E_n = \left(n + \tfrac{1}{2}\right)h\nu, \qquad x_{n,n+1} = x_{n+1,n} = \sqrt{\frac{(n + 1)h}{8\pi^2 m \nu}}, \qquad x_{nm} = 0 \text{ if } m \neq n \pm 1$$

satisfy the quantum mechanics. To do this, compute the matrix components $\dot{x}_{nm}$, and find the matrix of the energy expression $(m/2)(\dot{x}^2 + 4\pi^2\nu^2 x^2)$, computing the matrices of $\dot{x}^2$ and $x^2$ by the multiplication rule. Show that this matrix is diagonal, its diagonal components being the energy values given above.

6. By comparing with the wave functions of the linear oscillator in Chap. XXIX, Prob. 6, verify that the values of matrix components in Prob. 5 are correct. If you cannot give a general proof, take the actual wave functions you have worked out, in Prob. 6, Chap. XXIX, using them for $n = 0, 1, 2$, normalizing, and calculating the matrix components by direct integration.

7. Show that a linear oscillator radiating from the $n$th stationary state cannot jump except to the $(n - 1)$st state, so that there is a selection principle on its radiation. Compute the rate of radiation of the oscillator in the $n$th state, on the assumption that it is the same as that of a classical oscillator whose charge is $e$, displacement is $x_{n,n-1}\, e^{\frac{2\pi i}{h}(E_n - E_{n-1})t} + x_{n-1,n}\, e^{\frac{2\pi i}{h}(E_{n-1} - E_n)t}$. Compare this displacement with the displacement of a classical oscillator of energy $E_n$, showing that in the limit of large quantum numbers both amplitude and frequency of the classical oscillator agree with the quantum values.
This is an example of the correspondence principle.

8. Solve Schrödinger's equation for a rotator, whose kinetic energy is $\tfrac{1}{2}I\dot{\theta}^2$, in the absence of an external force. Find wave functions, showing that the angular momentum is an integral multiple of $h/2\pi$. Compute the matrix of $R \cos \theta$, one component of displacement of a point attached to the rotator at a distance $R$ from the axis. Show that all matrix components are zero except those in which the angular momentum changes by $\pm 1$ unit.

9. Find what $p^2 q - q p^2$ is equal to, using the commutation rule for $pq - qp$.

10. Show that $e^{\frac{2\pi i}{h} a p}\, u(x)$, where $p$ is the $x$ component of momentum, $a$ is a constant, is equal to $u(x + a)$. Use Taylor's expansion of the exponential operator.

11. Write down Schrödinger's equation in spherical polar coordinates, by using the Laplacian in these coordinates, assuming a potential $V(r)$. Discuss the method of deriving the equation from the Hamiltonian by replacing the momenta by differentiations, showing that the former method is consistent with the latter, but that the latter method does not lead to unique results.

CHAPTER XXXII

PERTURBATION THEORY

There are many problems in wave mechanics which, though they cannot be exactly solved, are approximated by soluble problems. Thus a nonlinear oscillator can be approximated by a linear one; or a system, as an atom, in an external electric or magnetic field can be approximated by the same system without the field. The perturbation theory is adapted to the solution of such problems, starting with the known approximate solution, and expanding in power series in the perturbation. At the same time, there are some problems of more general nature treated by perturbation theory. Thus the radiation of an atom can be examined by treating the interaction of the atom and a radiation field as a perturbation. We shall be led by such questions to a discussion of the transitions between stationary states.
The actual method we shall use is closely analogous to the perturbation theory used with the nonuniform vibrating string.

230. The Secular Equation of Perturbation Theory. Suppose that we wish to solve Schrödinger's equation $H u_n = E_n u_n$, where $H$ is the given Hamiltonian. Let us start with a set of orthogonal functions $u_n^0$, which often are solutions of a similar problem approximating the real one, and let us expand the correct functions $u_n$ in series in the $u_n^0$'s:

$$u_n = \sum_m S_{mn} u_m^0. \tag{1}$$

Then the problem may be regarded as that of finding the expansion coefficients $S_{mn}$, which are really coefficients of a linear transformation in function space transforming from the original set of orthogonal functions to the final, correct, ones, so that we may expect the $S$'s to satisfy orthogonality and normalization conditions. We substitute this expression for $u_n$ in Schrödinger's equation, and get the condition for the coefficients. If we substitute, multiply by $\bar{u}_k^0$, and integrate, we shall have only one term on the right, on account of orthogonality of the $u^0$'s; on the left, we shall have a linear combination, each term involving a matrix component of $H$ with respect to the $u^0$'s, for example $H_{km} = \int \bar{u}_k^0 H u_m^0\, dv$. We recall that, since the $u^0$'s are not solutions of the problem, this matrix will not be diagonal. Carrying out the substitution, we have

$$\sum_m (H_{km} - E_n \delta_{km}) S_{mn} = 0, \tag{2}$$

or an infinite set of equations for the infinite set of $S_{mn}$'s. Writing them for the $n$th stationary state, we have

$$\begin{aligned} (H_{11} - E_n) S_{1n} + H_{12} S_{2n} + H_{13} S_{3n} + \cdots &= 0 \quad (k = 1) \\ H_{21} S_{1n} + (H_{22} - E_n) S_{2n} + H_{23} S_{3n} + \cdots &= 0 \quad (k = 2) \\ &\;\;\vdots \end{aligned} \tag{3}$$

These equations are all homogeneous, of the same sort found whenever we have introduced normal coordinates or rotated axes, as, for example, in discussing coupled systems or the vibrating string.
As usual, the equations in general do not have a solution; they have one only if the determinant of coefficients

$$\begin{vmatrix} H_{11} - E_n & H_{12} & H_{13} & \cdots \\ H_{21} & H_{22} - E_n & H_{23} & \cdots \\ \vdots & & & \end{vmatrix} \tag{4}$$

is zero. This secular equation determines the energy levels.

231. The Power Series Solution. If the $u^0$'s were solutions of the problem, $H$ would have a diagonal matrix, the diagonal terms being the energy levels. Though this is not true, let us assume that the $u^0$'s are not far from solutions. Then by arguments of continuity the nondiagonal terms of $H$, though not zero, are small, and the diagonal terms, though not exactly the energy values $E_n$ of the exact solution, are not far from the correct values. Thus $E_n$ is approximately $H_{nn}$. We assume the problem is nondegenerate, by which we mean that only one state has even approximately this same value. Now let us recall how to expand a determinant. We take products of terms, choosing just one from each row, one from each column. There are $N!$ ways of doing this, if the determinant has $N$ rows and columns. We give each a sign $+$ or $-$ according to its requiring an even or an odd number of interchanges of rows or columns to bring the desired term to the principal diagonal. Finally we add. In this case, since we are dealing with small quantities, we look first for the largest product. This is plainly the principal diagonal, for the only large terms are those on the principal diagonal. For a first approximation we may set this equal to zero. It is already factored: $(H_{11} - E_n)(H_{22} - E_n) \cdots = 0$. One of the factors must be, to this approximation, zero. Plainly it must be $H_{nn} - E_n$, since this is the only term which is even small, assuming the system is nondegenerate. This then is the first-order approximation to the energy: $E_n = H_{nn}$, the diagonal component of the matrix of the energy with respect to the approximate wave function.
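The first-order statement, that a root of the secular equation lies close to the diagonal component $H_{nn}$ when the nondiagonal components are small, can be illustrated numerically. The sketch below is only an assumed example: the $3 \times 3$ matrix, the bracketing interval, and the function names are invented for illustration, and bisection is merely one convenient way of locating the root of the determinant.

```python
def det3(a):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def secular(h, e):
    """Secular determinant: det(H - E * unit matrix)."""
    shifted = [[h[i][j] - (e if i == j else 0.0) for j in range(3)]
               for i in range(3)]
    return det3(shifted)


def root_between(h, lo, hi, steps=60):
    """Bisection for a root of the secular determinant bracketed by lo and hi."""
    f_lo = secular(h, lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = secular(h, mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)


# Hypothetical energy matrix: well-separated diagonal terms, small nondiagonal terms.
H = [[1.00, 0.01, 0.02],
     [0.01, 2.00, 0.03],
     [0.02, 0.03, 3.00]]

e1 = root_between(H, 0.5, 1.5)   # the root lying near H[0][0]
print("first-order estimate:", H[0][0])
print("root of secular equation:", e1)
```

The root differs from the diagonal component only by a small amount, much smaller than the nondiagonal components themselves, which is why the diagonal component serves as the first approximation.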
Using the first-order approximation to the energy, we can easily get the corresponding linear transformation and wave functions. If the $u^0$'s were the correct wave functions, we should have $S_{nn} = 1$, all the other $S$'s $= 0$. To a first approximation, in the actual case, we may set $S_{nn} = 1$, but regard the other $S$'s as small quantities. Then we have, for example, for the first equation

$$(H_{11} - H_{nn}) S_{1n} + \cdots + H_{1n} + \cdots = 0,$$

where the terms we do not write are of a smaller order than those we write. Hence

$$S_{1n} = -\frac{H_{1n}}{H_{11} - H_{nn}}. \tag{5}$$

The other equations are of the same form, so that the approximate wave function is

$$u_n = u_n^0 - \sum_{k \neq n} \frac{H_{kn}}{H_{kk} - H_{nn}}\, u_k^0. \tag{6}$$

For the second approximation to the energy, we must consider further terms in the determinant. We can proceed by analogy with the case of a determinant of two rows and two columns, which we should have if there were only two stationary states to consider. In this case the secular equation would be

$$\begin{vmatrix} H_{11} - E_n & H_{12} \\ H_{21} & H_{22} - E_n \end{vmatrix} = (H_{11} - E_n)(H_{22} - E_n) - H_{12} H_{21} = 0. \tag{7}$$

This is only a quadratic for $E_n$, and can be immediately solved explicitly:

$$E_n = \frac{H_{11} + H_{22}}{2} \pm \sqrt{\left(\frac{H_{11} + H_{22}}{2}\right)^2 - (H_{11} H_{22} - H_{12} H_{21})} = \frac{H_{11} + H_{22}}{2} \pm \sqrt{\left(\frac{H_{11} - H_{22}}{2}\right)^2 + H_{12} H_{21}}. \tag{8}$$

This explicit formula is analogous to the formula for the frequency of a system of two coupled oscillators, obtained in Eq. 4, Chap. XI. Here as there, if the nondiagonal matrix component $H_{12}$ of the energy is small, we can expand the radical by the binomial theorem, obtaining without trouble for the two solutions as power series in $H_{12}$,

$$E_1 = H_{11} + \frac{H_{12} H_{21}}{H_{11} - H_{22}} + \cdots, \qquad E_2 = H_{22} + \frac{H_{12} H_{21}}{H_{22} - H_{11}} + \cdots, \tag{9}$$

analogous to Eq. (5) of Chap. XI. Here, as there, the effect of the second-order perturbation terms is to push apart the two levels. Thus the first-order calculation alone gives $E_1 = H_{11}$.
The numerator of the fraction giving the second-order calculation, $H_{12} H_{21}$, is really a perfect square, for it can be shown that $H_{21} = \bar{H}_{12}$, so that $H_{12} H_{21} = |H_{12}|^2$ (similar theorems hold in general for all the matrices of real quantities which we meet). Thus the numerator is positive. If $H_{11}$ is greater than $H_{22}$, so that the first-order level 1 lies above 2, the denominator is also positive, so that the level is still further raised by this perturbation. On the other hand, for the other level, the denominator is negative, and the level is further depressed.

The exact solution which we have obtained in Eq. (8) is only possible when the secular equation is simple enough to handle algebraically. The approximations (9), however, can be found directly from the secular equation (7). Thus let us consider $E_1$. We assume that the equation is not degenerate, so that $H_{22} - E_1$ is not a small quantity, and we may divide Eq. (7) by it. Thus we have

$$H_{11} - E_1 = \frac{H_{12} H_{21}}{H_{22} - E_1}.$$

Replacing the $E_1$ in the denominator by its value $H_{11}$, which is correct to the first order, this becomes

$$E_1 = H_{11} + \frac{H_{12} H_{21}}{H_{11} - H_{22}},$$

agreeing with Eq. (9). By a little consideration of the determinant, exactly a similar discussion can be given in the general case. And the result proves to be simply

$$E_n = H_{nn} - \sum_{k \neq n} \frac{H_{nk} H_{kn}}{H_{kk} - H_{nn}}, \tag{10}$$

in agreement with the special case solved above. It is very rarely that further approximations than we have given are used, for either the energy or the wave function.

232. Perturbation Theory for Degenerate Systems. We shall often meet cases in which the unperturbed problem is degenerate; that is, where the diagonal energies $H_{nn}$ of several states are almost exactly equal to each other. In this case, the power series method evidently does not work; the differences of energy which appear in the denominator of the terms in Eq.
(9) or (10) become zero, or very small, and the series diverge and even have infinite terms. If there were only two levels, as in the special case taken up in the last section, we could solve the problem explicitly, not using the power series at all. Thus if $H_{11} - H_{22} = 0$, Eq. (8) gives

$$E = H_{11} \pm H_{12}, \tag{11}$$

an important formula for perturbations of degenerate systems. With a finite number of degenerate levels, we have a secular equation of finite degree, and while we cannot solve it as conveniently as the quadratic, still we can approximate its solutions, even for the degenerate case where the differences of diagonal energies are smaller than the nondiagonal energy terms. Now it fortunately happens that in many problems in which degeneracy enters, as in atomic spectra, the levels fall into groups, the energies of all the levels in a group being about the same, but the different groups being well separated in energy. Such groups of levels are the multiplets in atomic spectra. In these cases we first solve the problem of the levels within a group, finding an exact solution for the finite secular equation. This solution gives us not only energy levels, but also coefficients of linear combinations transforming the original wave functions of the group into a new set which has the property that it makes the matrix components of the energy diagonal, with respect to the states of this group. We then use these transformed functions as the starting point of a new perturbation calculation, in which perturbations between adjacent groups are considered. In terms of these transformed functions, the energy will have no nondiagonal components between levels which lie close to each other, in the groups, but only between levels in different groups, at a considerable energy distance apart. Thus we may use the series method of Eq.
(10), and the second-order terms will be small, since the only terms of the summation for which the denominator is small will have numerators equal to zero.

It is to be particularly noted in this discussion that the difficulty in applying the power series method to degenerate systems arises, not on account of any unusual size of the nondiagonal energy components, but on account of the unusually small energy differences between diagonal terms. The method converges only if the nondiagonal component between any two levels is small compared with the difference of diagonal energies of the two terms. This demands that before applying the power series method the nondiagonal terms between degenerate levels be removed, but it imposes no such requirement on the terms between levels of quite different energy. We can see more clearly what is happening from a mechanical analogy. Suppose we have a large number of mechanical oscillators coupled together, all having different natural frequencies, except the first two, which have unperturbed frequencies exactly, or almost exactly, the same. In considering the interaction, the effect of the two of equal frequency on each other will be large, since each one resonates with the other, but the others will have much less effect. We, therefore, first solve only the interaction of these two resonating oscillators, introducing normal coordinates for them. Then we can proceed with the discussion of the interaction, treating the effect of the other oscillators, not on these two oscillators individually, but on the two normal coordinates representing them. Of course, if there are several groups of degenerate levels, we introduce changes of variables inside each group first, then apply the ordinary perturbation theory.
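The contrast between the nondegenerate series of Eq. (9) and the degenerate result of Eq. (11) can be seen in a small numerical sketch of the two-level problem. The function names and the sample numbers are assumptions made for illustration, and $H_{12}$ is taken real, so that $H_{12} H_{21} = H_{12}^2$.

```python
import math


def exact_levels(h11, h22, h12):
    """Exact roots of the two-level secular equation, Eq. (8), for real H12 = H21."""
    mean = 0.5 * (h11 + h22)
    radical = math.sqrt((0.5 * (h11 - h22)) ** 2 + h12 * h12)
    return mean + radical, mean - radical   # upper level, lower level


def second_order_levels(h11, h22, h12):
    """Power-series levels of Eq. (9); meaningful only if |H12| << |H11 - H22|."""
    e1 = h11 + h12 * h12 / (h11 - h22)
    e2 = h22 + h12 * h12 / (h22 - h11)
    return e1, e2


# Nondegenerate case: diagonal energies well separated compared with H12.
upper, lower = exact_levels(2.0, 1.0, 0.1)
e1, e2 = second_order_levels(2.0, 1.0, 0.1)
print("exact:       ", upper, lower)
print("second order:", e1, e2)

# Degenerate case: H11 = H22, so the series of Eq. (9) has a zero denominator,
# while the exact solution reduces to H11 +/- H12, as in Eq. (11).
deg_upper, deg_lower = exact_levels(1.0, 1.0, 0.1)
print("degenerate:  ", deg_upper, deg_lower)
```

In the nondegenerate case the two sets of values agree closely and the levels are pushed apart; in the degenerate case calling `second_order_levels` would fail with a division by zero, which is the numerical counterpart of the infinite terms mentioned in the text.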
We shall have many examples of degenerate systems in our discussion of atomic structure, where nearly every energy level of an unperturbed atom is degenerate, and is split up by an external perturbing field, as an electric or magnetic field. In more complicated atoms, the perturbing fields come from within the atom itself, being interactions of one part on another, producing the multiplet structure. In actual practice, we shall find the study of degenerate systems very important.

233. The Method of Variation of Constants. A slightly different point of view in perturbations is obtained by considering the time variation. Let us expand $\psi$, the correct wave function depending on time, in series in the unperturbed functions $u^0$: $\psi = \sum_m C_m(t)\, u_m^0(x)$, where the $C$'s, functions of time, would be pure exponentials, $c_n e^{-\frac{2\pi i}{h}E_n t}$, if the $u^0$'s were the correct solutions of the problem. Whether correct or not, we can always make the expansion above, for at any instant $\psi$ can be expressed in series in the orthogonal functions $u^0$, the coefficients being functions of time. Now let us try to satisfy the equation $H\psi = -\frac{h}{2\pi i}\frac{\partial \psi}{\partial t}$. We have

$$\sum_m C_m H u_m^0 = -\frac{h}{2\pi i}\sum_m \frac{dC_m}{dt}\, u_m^0.$$

Multiplying by $\bar{u}_k^0$ and integrating, the result is

$$\frac{dC_k}{dt} = -\frac{2\pi i}{h}\sum_m H_{km} C_m. \tag{12}$$

These equations for the time derivatives of the $C$'s in terms of their instantaneous values are enough to determine the complete solution of the problem. To make connection with the ordinary method, we need only assume $C_m = S_{mn} e^{-\frac{2\pi i}{h}E_n t}$, an exponential solution. Then immediately we have, canceling the exponential, and the factor $-2\pi i/h$,

$$E_n S_{kn} = \sum_m H_{km} S_{mn},$$

or exactly the equation we have previously used. In more general cases, however, it is not always possible to make this assumption. An example is that in which the perturbative force depends on the time.

234. External Radiation Field.
â€” The most interesting example of the method of variation of constants is the perturbation by an external radiation field, for this actually produces transitions between stationary states. First let us look a moment at the physical side of the problem, so as to understand what we expect to obtain from the calculations. An ordinary radiation field is never exactly sinusoidal; its amplitude, at a given point of space, as function of time, may be analyzed in Fourier series of very long period, as in Sec. 185, Chap. XXV. If the field is approxi- mately monochromatic of frequency v , that means that only PERTURBATION THEORY 393 frequencies in the neighborhood of v will have large amplitudes in the Fourier representation. On the other hand, if it is con- tinuous radiation, as the radiation from hot solids, there will be considerable amplitude in all frequencies, at least over a certain region. We assume the latter case. The electric field in the x direction at a given point will then be V#â€ž cos 2r(vt â€” a v ), V where E v , a v , are amplitude and phase of the component of fre- quency v, and where we have components of frequencies differing by small increments dv = 1/T, where T is the fundamental period. The phases a v of successive components may be treated as being statistically independent of each other; that is, if we take any two components, the chance that the phase angle between them at any instant should have one value is just equal to the chance that it have another value. The values of E v will be treated as functions of v, though a somewhat more general treatment subjects them to probability laws too. Now we are interested in finding p v dv, the energy per unit volume in the frequency range dv. Since one component of the series is asso- ciated with the range dv = 1/T, we can simply find the energy of this component. 
For the $x$ component of electric field, this is $\frac{1}{8\pi}\left[2E_\nu^2\cos^2 2\pi(\nu t - \alpha_\nu)\right]$, the factor 2 taking account of the magnetic field as well as the electric field. The time average of this term is $E_\nu^2/(8\pi)$. If we are dealing with radiation having equal intensities in all directions, the mean energy per unit volume associated with $x$, $y$, and $z$ coordinates will be equal. Hence we have

$$\rho_\nu\,d\nu = \frac{3}{8\pi}E_\nu^2. \tag{13}$$

235. Einstein's Probability Coefficients. Now suppose a radiation field of the type we have described is allowed to act on an atomic system. Einstein was the first to solve this problem. He assumed that, if the atom is in its $m$th state, there will be the following probabilities of transition to other states, induced by the radiation field:

1. A probability $A_{mn}$ of radiating spontaneously to each state $n$ which is of lower energy than the $m$th, with emission of the corresponding photon of frequency $\nu_{mn}$, given by $E_m - E_n = h\nu_{mn}$. This spontaneous emission corresponds to the ordinary radiation of an oscillating dipole in classical electromagnetic theory.

2. A probability $B_{mn}\rho_{mn}$ of absorbing a photon of frequency $\nu_{mn}$ from the radiation field, where now the state $n$ has higher energy than $m$, and of jumping up to the state $n$. This probability is proportional to the energy density $\rho_{mn}$ at the particular frequency $\nu_{mn}$ in the external radiation field.

3. A probability $B_{mn}\rho_{mn}$, where now the $n$th state lies below the $m$th, of emitting a photon of frequency $\nu_{mn}$, and falling to the lower state, under action of the radiation. This is called induced or forced emission.

Einstein assumed that the following relations held between the $A$'s and $B$'s corresponding to any transition between $n$ and $m$, where $E_m > E_n$: $B_{mn} = B_{nm}$, and $A_{mn}/B_{mn} = 8\pi h\nu_{mn}^3/c^3$. Assuming these simple laws, he could then give a very elementary derivation of Planck's law of black-body radiation.
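The three processes and the two relations just stated can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the original text: it takes the populations of the two states to be in the Boltzmann ratio $e^{-h\nu/kT}$, balances spontaneous plus induced emission against absorption, and compares the resulting energy density with the Planck form. The constants are modern rounded values in Gaussian units, and the function names are hypothetical.

```python
import math

# Assumed modern rounded constants in Gaussian (c.g.s.) units.
h = 6.626e-27   # erg s
c = 2.998e10    # cm / s
k = 1.381e-16   # erg / K

def a_over_b(nu):
    """Einstein's assumed relation A/B = 8 pi h nu^3 / c^3."""
    return 8.0 * math.pi * h * nu ** 3 / c ** 3

def equilibrium_density(nu, T):
    """Solve N2 (A + B rho) = N1 B rho for rho, with B12 = B21 and
    N2/N1 = exp(-h nu / k T) (Boltzmann ratio, assumed here)."""
    ratio = math.exp(-h * nu / (k * T))      # N2 / N1
    return ratio * a_over_b(nu) / (1.0 - ratio)

def planck(nu, T):
    """Planck energy density per unit frequency range."""
    return a_over_b(nu) / math.expm1(h * nu / (k * T))

nu, T = 5.0e14, 6000.0                       # illustrative frequency and temperature
print(equilibrium_density(nu, T), planck(nu, T))
```

The two printed numbers agree to rounding error, which is simply the algebra of the balance condition carried out numerically.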
Let us assume that we have a piece of matter containing many kinds of atoms, so as to have some capable of emitting and absorbing each frequency. Consider a particular set of atoms having a lower state 1, an upper state 2, and assume that at temperature $T$ the number of atoms in the upper state is to the number in the lower state as $e^{-E_2/kT}$ is to $e^{-E_1/kT}$, or the Maxwell-Boltzmann distribution law. Now we ask, what intensity, or energy density, in the external radiation field must we have to be in equilibrium with these atoms? If we can find this for each frequency of radiation, we shall necessarily have the distribution of intensity in radiation in equilibrium with matter at temperature $T$, which is what Planck's law gives. Let $N_2$ be the number of atoms in the second state, $N_1$ in the first, so that $N_2/N_1 = e^{-(E_2-E_1)/kT} = e^{-h\nu/kT}$, where $\nu$ is the frequency emitted or absorbed by the atom in its transition. Now we know that the number of atoms leaving the second state per second is equal to the sum of the following:

1. The number leaving on account of spontaneous radiation, or $N_2A_{21}$.

2. The number entering on account of absorption from the lower state, or $-N_1B_{12}\rho_{12}$.

3. The number leaving on account of induced emission, or $N_2B_{21}\rho_{12}$.

This sum must be zero, in a steady state where the $N$'s are constant. Hence

$$N_2(A_{21} + B_{21}\rho_{12}) = N_1B_{12}\rho_{12}.$$

Using the relation between the $A$'s and $B$'s, this is

$$N_2B_{12}\left(\rho_{12} + \frac{8\pi h\nu^3}{c^3}\right) = N_1B_{12}\rho_{12}.$$

Setting $N_2/N_1 = e^{-h\nu/kT}$, canceling $B_{12}$, and solving for $\rho_{12}$, we have

$$\rho_{12} = \frac{8\pi h\nu^3}{c^3}\,\frac{1}{e^{h\nu/kT} - 1}, \tag{14}$$

which is Planck's law of black-body radiation.

236. Method of Deriving the Probability Coefficients. Einstein's coefficient $A$ is often derived by analogy with classical theory as follows: In Chap. XXXI we have seen that the matrix components of electric moment are connected with probabilities of radiation.
Thus, if the amplitude of the component of moment of the atom corresponding to the transition $2 \to 1$ is $C$, the corresponding classical rate of radiation is $16\pi^4\nu^4C^2/(3c^3)$. We can write this component in terms of the matrices as follows: corresponding to this frequency, we have the terms

$$(ex)_{12}\,e^{(2\pi i/h)(E_1 - E_2)t} + (ex)_{21}\,e^{(2\pi i/h)(E_2 - E_1)t} = 2(ex)_{12}\cos 2\pi\nu t,$$

where $h\nu = E_2 - E_1$. Thus $C = 2(ex)_{12}$, and the rate of radiation is $64\pi^4\nu^4(ex)_{12}^2/(3c^3)$. But an atom with a probability $A_{21}$ of radiating a photon of energy $h\nu$ is radiating on the average at the rate of $A_{21}h\nu$ per second. Hence we must set this equal to the rate of radiation above, giving

$$A_{21} = \frac{64\pi^4\nu^3(ex)_{12}^2}{3hc^3}, \qquad B_{21} = A_{21}\frac{c^3}{8\pi h\nu^3} = \frac{8\pi^3(ex)_{12}^2}{3h^2}. \tag{15}$$

The argument given above is hardly a derivation; it is merely suggestive. To get a real derivation of the probabilities, we use the method of perturbations. We shall find, for a reason to be discussed in a later section, that we can only obtain the $B$'s by this method. We shall assume that at $t = 0$ the atom is definitely in the $m$th state; that is, $c_m^0 = 1$, all other $c^0$'s are zero, where the $c$'s are the coefficients in the expansion of the wave function $\psi$ in terms of the unperturbed stationary states, so that $\bar c_n c_n$ is the probability of finding the system in the $n$th state, and the $c^0$'s are the values when $t = 0$. Then we shall investigate the time variation of the $c$'s by the method of variation of constants, and it will appear that the $\bar c_n c_n$'s for $n$ different from $m$ increase linearly with time, so long as we consider only small intervals of time and small perturbations, the term $\bar c_m c_m$ correspondingly decreasing. This we interpret as a definite probability that the system will leave the $m$th state and go to the $n$th; in fact we shall find $\bar c_n c_n$ equal exactly to $B_{mn}\rho_{mn}t$, as far as the variation is linear with time. By comparing this expression with the derived values of $\bar c_n c_n$, we can evaluate the $B$'s directly from perturbation theory.

237. Application of Perturbation Theory. Let the Hamiltonian of the system without radiation be $H^0$, and assume that the unperturbed problem can be solved exactly. Let the perturbed Hamiltonian be

$$H^0 - ex\sum_\nu E_\nu\cos 2\pi(\nu t - \alpha_\nu),$$

the second term representing the potential of the force of the field represented by the summation, on the charge $e$. Under the action of the perturbation, let the perturbed wave function be $\psi = \sum_m C_m(t)\,u_m^0(x)$. Our task now is to find the $C$'s. Using the method of variation of constants, noting that $H^0$ has a diagonal matrix, we have

$$\frac{dC_n}{dt} = -\frac{2\pi i}{h}\sum_k H_{nk}C_k = -\frac{2\pi i}{h}H_{nn}^0C_n + \frac{2\pi i}{h}\sum_k (ex)_{nk}C_k\sum_\nu E_\nu\cos 2\pi(\nu t - \alpha_\nu).$$

Now let $C_n = c_n(t)\,e^{-(2\pi i/h)H_{nn}^0 t}$, where $c_n(t)$ would be constant in the absence of an external field. Writing the field in exponential form, and letting $H_{nn}^0 - H_{kk}^0 = h\nu_{nk}$, this gives

$$\frac{dc_n}{dt} = \frac{\pi i}{h}\sum_k (ex)_{nk}c_k\sum_\nu E_\nu\left\{e^{2\pi i[(\nu + \nu_{nk})t - \alpha_\nu]} + e^{-2\pi i[(\nu - \nu_{nk})t - \alpha_\nu]}\right\}.$$

If the external field were not present, we plainly would have $dc_n/dt = 0$; if there is a small field, the time derivative will be small, or, in other words, the $c$'s will be approximately constant. To a first approximation we may assume on the right side that the $c$'s are exactly constant, having the values $c^0$ which they had at $t = 0$. If this is so, we may integrate directly, obtaining

$$c_n - c_n^0 = \frac{1}{2h}\sum_k (ex)_{nk}c_k^0\sum_\nu E_\nu\left\{e^{-2\pi i\alpha_\nu}\,\frac{e^{2\pi i(\nu + \nu_{nk})t} - 1}{\nu + \nu_{nk}} - e^{2\pi i\alpha_\nu}\,\frac{e^{-2\pi i(\nu - \nu_{nk})t} - 1}{\nu - \nu_{nk}}\right\}. \tag{16}$$

Now let us take the case we have discussed, where at $t = 0$ we have $c_m^0 = 1$, all the other $c$'s zero. Then for any $n \neq m$, we have only the single term of the summation above for which $k = m$. Next we find $\bar c_n c_n$. In this, we have a product of two sums over $\nu$, which is, therefore, a double sum. Each such term for which we have different frequencies in the two factors has a term $e^{-2\pi i(\alpha_\nu - \alpha_{\nu'})}$, which, on account of the random nature of the phases, is as likely to be positive as negative, and on the average cancels.
Thus we are left with only the squares of the individual terms, in which the $\alpha$'s drop out. Further, each of these squares has terms whose denominators are respectively $(\nu + \nu_{nm})^2$, $(\nu + \nu_{nm})(\nu - \nu_{nm})$, and $(\nu - \nu_{nm})^2$. The frequency $\nu_{nm}$ is so defined that it is positive if the $n$th state lies above the $m$th, which we assume to be the case for the moment. When $\nu$ becomes nearly equal to $\nu_{nm}$, the term $(\nu - \nu_{nm})^2$ is very small, the term with this as denominator very large. Since $\nu$ is always positive, it is not possible for the other terms, involving $\nu + \nu_{nm}$ in the denominator, to become so large. To an approximation, then, we neglect all terms except the last, obtaining

$$\bar c_n c_n = \frac{(ex)_{nm}^2}{4h^2}\sum_\nu E_\nu^2\,\frac{\left|e^{-2\pi i(\nu - \nu_{nm})t} - 1\right|^2}{(\nu - \nu_{nm})^2} = \frac{(ex)_{nm}^2}{2h^2}\sum_\nu E_\nu^2\,\frac{1 - \cos 2\pi(\nu - \nu_{nm})t}{(\nu - \nu_{nm})^2} = \frac{(ex)_{nm}^2}{h^2}\sum_\nu E_\nu^2\,\frac{\sin^2\pi(\nu - \nu_{nm})t}{(\nu - \nu_{nm})^2}. \tag{17}$$

The formula we have just derived is decidedly significant. It gives essentially the probability that the system will go, in time $t$, from state $m$ to state $n$, under the action of the radiation. For a particular frequency $\nu$, this probability is seen to be proportional to $E_\nu^2$, that is, to the intensity of the incident radiation; and to $(ex)_{nm}^2$, the square of the matrix of the electric moment connected with this particular transition, which we should expect. But in addition, there is a dependence on frequency. If we plot $\bar c_n c_n$, at time $t$, against $\nu$, the impressed frequency, we get a narrow peak with small side bands, centering at $\nu_{nm}$, just like the pattern found in Fraunhofer diffraction. Thus, if the impressed frequency is close to the absorption frequency $\nu_{nm}$, there will be a large probability of transition, while if it is farther away, the probability will be smaller.
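The shape of this peak is easy to examine numerically. The sketch below is illustrative only: the function names, the center frequency, and the integration window are arbitrary assumptions. It evaluates the frequency factor $\sin^2\pi(\nu - \nu_{nm})t/(\nu - \nu_{nm})^2$ from Eq. (17) and integrates it over a window around the peak by the trapezoidal rule. The peak height is $(\pi t)^2$ and the width is of order $1/t$, so the area should grow in proportion to $t$.

```python
import math

def line_shape(nu, nu0, t):
    """sin^2[pi (nu - nu0) t] / (nu - nu0)^2, the frequency factor in Eq. (17)."""
    x = nu - nu0
    if x == 0.0:
        return (math.pi * t) ** 2        # limiting value at the center of the peak
    return math.sin(math.pi * x * t) ** 2 / x ** 2

def area(nu0, t, half_width, n=200001):
    """Trapezoidal integral of the line shape over nu0 +/- half_width."""
    dnu = 2.0 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * line_shape(nu0 - half_width + i * dnu, nu0, t)
    return total * dnu

nu0 = 1000.0                             # arbitrary center frequency
a1 = area(nu0, 1.0, 200.0)
a2 = area(nu0, 2.0, 200.0)
print(a1 / math.pi ** 2, a2 / a1)        # both ratios are near 1 and 2 respectively
```

Doubling the time quadruples the height of the peak but halves its width, so the integrated probability doubles. The small departure of the first ratio from unity comes from the tails cut off by the finite window.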
If the perturbation acts only for a small time, the band will be broad, indicating that many frequencies can cause the transition, but if the time is long enough, practically only the frequency $\nu_{nm}$ can cause the transition; the absorption curve of the substance, in other words, will have a sharp absorption line corresponding to the various transitions from the state $m$ to other states $n$, as calculated by the quantum theory.

In carrying out the summation over $\nu$, it is evident that the essential contributions will come for frequency $\nu$ very close to $\nu_{nm}$. In this region, we may replace $E_\nu$ by its value at $\nu_{nm}$, which we have already seen to be given by $3E_{\nu_{nm}}^2/8\pi = \rho_{nm}\,d\nu$. Hence the summation reduces to an integration,

$$\bar c_n c_n = \frac{8\pi(ex)_{nm}^2}{3h^2}\,\rho_{nm}\int\frac{\sin^2\pi(\nu - \nu_{nm})t}{(\nu - \nu_{nm})^2}\,d\nu. \tag{18}$$

The integration should properly be taken from $\nu = 0$ to infinity. But since the integrand is large only in the immediate neighborhood of $\nu = \nu_{nm}$, we shall make a negligible error if we integrate from $-\infty$ to $\infty$. The integral is then $\pi t\int_{-\infty}^{\infty}\frac{\sin^2 z}{z^2}\,dz$, where $z = \pi(\nu - \nu_{nm})t$. This can be easily evaluated, giving $\pi^2 t$. Thus we have finally

$$\bar c_n c_n = \frac{8\pi^3(ex)_{nm}^2}{3h^2}\,\rho_{nm}t, \tag{19}$$

or $B_{nm}\rho_{nm}t$, where $B_{nm}$ is as given before. Thus we have verified our earlier statement regarding the probability coefficients $B$. A simple variation of the argument applies to states $n$ of lower energy than the state $m$, resulting in the probability of forced emission, and if we compute $\bar c_m c_m$, we find that the number of systems in the $m$th state decreases at a rate to compensate the increase in the other states. This can be shown easily on general grounds as well as by direct computation, for it can be shown that the sum of the quantities $\bar c_n c_n$ for all states remains constant.

238. Spontaneous Radiation and Coupled Systems. The calculation we have just given did not lead to the probability of spontaneous emission $A_{nm}$.
An attempt might be made to include it by adding to the external force a radiation resistance term, depending on the velocity of the electron, but this method proves not to lead to the right answer. The proper treatment, as a matter of fact, must be sought in a different direction. We treat the radiation field, not as a perturbation, but as part of the system. It is possible to apply the quantum theory directly to the field by itself. For instance, if the radiation is confined in a rectangular box with perfectly reflecting walls, the electromagnetic field inside consists of a set of standing waves, of all the wave lengths allowed for a vibrating solid of the corresponding size, and with corresponding frequencies. We can now introduce normal coordinates, each corresponding to one mode of vibration, and the classical equations of motion of these normal coordinates are just like those of a linear oscillator. In a corresponding way, in wave mechanics, we treat these normal coordinates, set up a wave equation for each, and find that each one is quantized, with energy $(n + \tfrac{1}{2})h\nu$, where $\nu$ is the frequency of the wave, $n$ a quantum number associated with this particular mode of vibration. A change of this quantum number by unity corresponds to an increase or decrease of the energy of the radiation field by one unit $h\nu$, and this we identify with the creation or destruction of a photon of this energy, by interaction with matter. Next we treat the atomic system just as if the radiation were not present. In this case, the atom will continuously stay in the same stationary state, and similarly the radiation field will always keep the same quantum numbers, meaning that no photons are being created or destroyed. But finally we introduce into the complete system of atoms and radiation a perturbation, corresponding to the potential of the atom in the radiation field (including the vector as well as scalar potential).
This couples the two systems together, and under the influence of the perturbation transitions are possible, in which the atoms gain or lose energy in passing between stationary states, and the radiation field loses or gains an equal energy, which appears as destruction or creation of corresponding photons, or decrease or increase of the quantum number of the proper normal vibration of the radiation. When the probability of these processes is investigated, by the method of variation of constants, it is found that we obtain not only the probability of forced absorption or emission, $B\rho$, but also the probability of spontaneous emission $A$. It is not hard by this method to investigate other questions as well, as for instance the breadths of absorption or emission lines, that is, the question of just what frequencies of light can interact with a given atomic system. The general result is that, the shorter the life of an atom in either the upper or lower state associated with a transition, the broader the corresponding absorption or emission line.

It is interesting to look a little more closely at the sort of perturbation problem we meet in considering spontaneous radiation, for example. Suppose we start with the atom in an excited state, and with no energy in the radiation field. Then, after the transition, the atom will be in its normal state, having lost energy, and the radiation will be in an excited state, having gained the corresponding energy. The total energy of the system will be the same in either case. Now neither one of these situations is a steady state, for neither one persists indefinitely. Both are approximate steady states, corresponding to the same energy. The perturbation problem, then, is one in perturbations of a degenerate system, having two equal energy levels.
We have seen that such a perturbation problem leads to mathematics just like two coupled mechanical systems, as two pendulums, and it is convenient to use the mechanical language in describing what happens. Our present problem is like two pendulums of equal period (corresponding to the equal energy levels), coupled together. If the first pendulum vibrates alone, that corresponds to the state in which the atom is excited; if the second vibrates, it corresponds to the radiation being excited. But neither of these mechanical motions can occur by itself; if we start one pendulum vibrating, in time it comes to rest, and the other takes up all the energy. This corresponds to the fact that the system gradually changes so that the atom is in its normal state, the radiation excited.

There is a flaw in our analogy, however: the energy in the mechanical case goes back to the first pendulum, while the atom does not come back to the excited state. The answer to this difficulty is easily given. The radiation field actually has not one mode of motion only, but many, all of about the same energy, all capable of interacting with the atom. Thus the emitted photon can travel in any direction, and not only that, photons of many different energies, all in the neighborhood of the energy ordinarily emitted by the atom, can interact, on account of the finite breadth of the spectral line. Thus while the situation where the atom is excited, and the radiation is in its normal state, is just one state, there are a great number of states corresponding to the other situation. It is as if our one pendulum corresponding to the excited atom, interacted with a great, or even infinite, number corresponding to the excited radiation. In these circumstances, the mechanical energy originally in the first pendulum would soon become dissipated, scattered through the others, and it will never happen all to come back to the first one, though a little might.
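The difference between one partner and many can be seen in a small numerical experiment. The following is a sketch under simplifying assumptions, not a treatment of the actual radiation problem: one level of energy zero (the excited atom) is coupled with equal strength `V` to a set of levels standing for the excited radiation, units are chosen so that $h/2\pi = 1$, and the equations for the coefficients are integrated by the classical Runge-Kutta method. The function name `survival`, the coupling strength, and the band of 81 equally spaced levels are all illustrative choices.

```python
import math

def survival(energies, V, t, dt=0.05):
    """Probability of still finding the system in level 0 at time t.

    Level 0 (energy 0) is coupled with strength V to levels with the given
    energies; units with h/(2 pi) = 1; fourth-order Runge-Kutta integration
    of dc/dt = -i H c.
    """
    n = len(energies)
    steps = max(1, round(t / dt))
    dt = t / steps

    def deriv(c0, c):
        dc0 = -1j * V * sum(c)
        dc = [-1j * (e * cj + V * c0) for e, cj in zip(energies, c)]
        return dc0, dc

    c0, c = 1.0 + 0j, [0j] * n
    for _ in range(steps):
        k1 = deriv(c0, c)
        k2 = deriv(c0 + 0.5 * dt * k1[0], [a + 0.5 * dt * b for a, b in zip(c, k1[1])])
        k3 = deriv(c0 + 0.5 * dt * k2[0], [a + 0.5 * dt * b for a, b in zip(c, k2[1])])
        k4 = deriv(c0 + dt * k3[0], [a + dt * b for a, b in zip(c, k3[1])])
        c0 += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        c = [a + dt * (p + 2 * q + 2 * r + s) / 6
             for a, p, q, r, s in zip(c, k1[1], k2[1], k3[1], k4[1])]
    return abs(c0) ** 2

V = 0.05
t_return = math.pi / V                       # time at which two equal pendulums return

two = survival([0.0], V, t_return)           # a single partner of equal energy
band = [-2.0 + 0.05 * j for j in range(81)]  # many partners of nearly equal energy
many = survival(band, V, t_return)
print(two, many)
```

With a single partner the energy has come back completely at the time `t_return`, while with the band of partners almost nothing has returned. Over much longer times a partial return can still occur, because the band used here is finite and discrete.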
Physically, the radiation emitted by the atom travels to a great distance, and is very unlikely ever to find its way back to the atom which sent it out. But if the whole thing is enclosed in a box with reflecting walls, there will be a certain chance, finite though small, that the radiation will be eventually reflected back to the atom and absorbed.

One significant feature of the situation is that there are real stationary states for the system of atom plus radiation. This follows directly from the fact that we can solve the perturbation problem. Just as with the coupled pendulums, there are normal coordinates, consisting of combinations of the various separate coordinates. Thus, there is some combination of all the various probabilities of the atom being in various states of excitation and the radiation field being in corresponding states which could persist indefinitely, and is thus a stationary state. The things we ordinarily think of as stationary states are combinations of these, just as the state where one pendulum is excited, the other at rest, is a combination of the two normal coordinates, with definite amplitudes and phases. These are really not stationary states at all, for they change with time. In any such problem, there are two equally good methods of treatment: first, we may use the unperturbed states which physically seem like stationary states, treating the perturbations between them by variation of constants, and so introducing apparent transitions into the problem; or secondly, we may introduce the real stationary states, by the ordinary perturbation theory, introducing the correct initial conditions, and following what happens as time goes on, without having any transitions at all between these real stationary states.
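The equivalence of the two methods can be made concrete for the simplest case of two degenerate levels. The sketch below is only an illustration: the diagonal energy `E0`, the nondiagonal component `V`, and the function names are assumptions, and units are chosen so that $h/2\pi = 1$. The first function integrates the equations of variation of constants step by step, starting with the system in the first unperturbed state. The second uses the real stationary states, which for this symmetric case are the sum and difference of the unperturbed functions, with energies `E0 + V` and `E0 - V`, and merely follows their phases in time.

```python
import cmath
import math

E0, V = 1.0, 0.1        # illustrative diagonal and nondiagonal energies, h/(2 pi) = 1

def by_variation_of_constants(t, steps=2000):
    """Integrate dC/dt = -i H C (Runge-Kutta), starting in unperturbed state 1."""
    dt = t / steps
    f = lambda a, b: (-1j * (E0 * a + V * b), -1j * (V * a + E0 * b))
    a, b = 1.0 + 0j, 0j
    for _ in range(steps):
        k1 = f(a, b)
        k2 = f(a + 0.5 * dt * k1[0], b + 0.5 * dt * k1[1])
        k3 = f(a + 0.5 * dt * k2[0], b + 0.5 * dt * k2[1])
        k4 = f(a + dt * k3[0], b + dt * k3[1])
        a += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        b += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return a, b

def by_stationary_states(t):
    """Expand the same initial state in the real stationary states."""
    plus = cmath.exp(-1j * (E0 + V) * t) / 2     # contribution of (u1 + u2)/sqrt(2)
    minus = cmath.exp(-1j * (E0 - V) * t) / 2    # contribution of (u1 - u2)/sqrt(2)
    return plus + minus, plus - minus

t = math.pi / (2 * V)   # time at which the first state should be empty
print(by_variation_of_constants(t))
print(by_stationary_states(t))
```

Both descriptions give the same coefficients. In the first the system appears to make a transition from one state to the other, while in the second nothing happens except that two stationary states of slightly different energy get out of phase.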
This point of view is very illuminating, for it shows us that the only distinction between stationary states and transitions is largely artificial, determined by the original unperturbed wave functions in which we choose to discuss the system.

239. Applications of Coupled Systems to Radioactivity and Electronic Collisions. Many other problems of transitions can be looked at from the same point of view we have just used in discussing radiation. One example is the radioactive disintegration, which we have considered in Chap. XXIX, Fig. 64. We might take as approximate stationary states first the discrete states of a particle within the finite depression, second the continuous states of the particle outside. If the barriers were infinitely high, there would be no transitions between them, but if the barrier is finite, we may start with a particle within the nucleus, and consider that it has a certain probability of a transition to a state of equal energy outside the barrier. This could be treated by the perturbation theory of degenerate systems, where we could find the probability of leaking out by variation of constants, or alternatively could get approximations to the actual stationary states of the system. In this case, as with radiation, the probability of the particle coming back and getting back into the nucleus again, though small, is finite, if the system is enclosed in a finite box. Here the stationary states which are combinations of solutions for the discrete and continuous regions are perfectly reasonable and natural, and the more accurate way of solving the problem would be to determine these stationary states by the Wentzel-Kramers-Brillouin method, and build up a wave packet at $t = 0$ corresponding to having all the distribution inside the nucleus, and asking how this packet spreads out as time goes on, though without change of real stationary states.

Another similar problem is that of collisions, either elastic or inelastic.
Suppose that an electron collides with an atom, being scattered either without change of energy, or with decrease or increase of energy corresponding to raising or lowering the energy of the atom. We can start with a number of unperturbed, not quite stationary, states: first, the electron approaching the atom, with the atom in its original state; secondly, an electron being scattered, say in a definite direction, or better with some function of angle represented by a spherical harmonic, with its initial energy, the atom being unchanged; thirdly, an electron scattered with a decrease of energy corresponding to a transition of the atom, with the atom in the correspondingly excited state; fourthly, an electron scattered with increase of energy, the atom being in a lower state, after what is called a collision of the second kind. All these states have the same energy, so that the perturbation problem between them, resulting from the fact that they are not solutions of the problem in the region where the electron is in the atom, is one of transition between systems of the same energy. Here, as before, it is often convenient to proceed by the method of variation of constants, and from this we get the probabilities of the various elastic and inelastic impacts. One thing is worth noting in all these problems: in the method of variation of constants, the quantity determining the probability of transition is the nondiagonal matrix component of the perturbing energy between the different approximately stationary states. Thus the calculation resolves itself into a computation of these matrix components, and transitions are likely for which the matrix components are large. In our radiation problem, the matrix components in question were those of the electrical energy, involving directly the matrix components of electric moment of the atom.
While the perturbation method can be used for discussing collisions, it is not very accurate, on account of the large perturbations which the colliding electron exerts on the atom during the instant of collision. Fortunately, at least in the case of elastic collisions, much better approximation methods are available. As we shall see later, an atom acts on an electron very much like a central field of force, and the problem of the scattering of an electron by a central field is merely the special case of the central field problem, discussed in the next chapter, which we meet if the electron is in a continuous rather than a quantized energy level. By analogy with the results which we shall obtain in Sec. 241, the wave function of an electron in a central field is a product of spherical harmonics of angle, times a certain function of $r$, and for an electron coming from infinity, this function of $r$ is of the form shown in Fig. 62, satisfying a definite boundary condition at the center of the atom, but becoming sinusoidal for large values of $r$. By combining an infinite number of such solutions, all corresponding to the same energy, but with different functions of angles, it can be shown that we can make the resultant wave at large distances approach a plane wave, representing a stream of electrons traveling in a definite direction. But the functions are such that, if the central field is not vanishingly small, it is not possible to build up exactly a plane wave. Instead, there are certain terms left over representing spherical waves traveling outward from the center of force, with amplitudes proportional to $1/r$, so that they are negligible compared to the plane waves at sufficiently large distances. These spherical waves represent the elastically scattered electrons.

Two particularly interesting features of the elastic scattering can be investigated by the method just described.
First, one may find the total intensity in the scattered wave, which can be proved to be equal to the total intensity removed from the plane wave by its passage over the atom. This gives the probability that an electron will be scattered by the atom, and it proves to increase as the atomic number of the atom increases, and to depend in a complicated way on the speed of the electron. This dependence is so complicated that in some cases, called the Ramsauer effect, very slow electrons have abnormally small probabilities of being scattered, and practically pass through the atom without hindrance. The probability of scattering is often described by defining an effective cross section for the atom, a cross section such that if all electrons striking it were scattered, and all passing around it were not, the probability of scattering would agree with the observed value. Plainly the effective cross section depends on electron velocity and on the nature of the atom. The second interesting feature of elastic scattering is the angular distribution of the scattered electrons, determined by the relative probabilities of scattering with the various spherical harmonic functions of angle. This again can show a complicated dependence on electron velocity and atomic constitution.

Problems

1. Prove that if both unperturbed and perturbed functions, $u_n^0$ and $u_n$, are orthogonal and normalized, the transformation coefficients $S_{mn}$ satisfy the orthogonality and normalization conditions.

2. Show that if we expand the correct wave functions in a series of functions which are not exactly orthogonal or normalized, the equations for the transformation coefficients $S_{mn}$ are $\sum_m (H_{km} - E_n d_{km})S_{mn} = 0$, where $d_{km} = \int \bar u_k^0 u_m^0\,dv$, which now is not diagonal and is not equal to $\delta_{km}$.

3. Consider a degenerate system in which there are two unperturbed wave functions, having equal diagonal energies $H_{11} = H_{22}$, which are normalized but not orthogonal to each other, so that $\int \bar u_1^0 u_2^0\,dv = d_{12} \neq 0$. Show that the two energy levels are $\dfrac{H_{11} + H_{21}}{1 + d_{12}}$ and $\dfrac{H_{11} - H_{21}}{1 - d_{12}}$.

4. Show that the two correct wave functions in Prob. 3 are $\dfrac{u_1^0 + u_2^0}{\sqrt{2(1 + d)}}$ and $\dfrac{u_1^0 - u_2^0}{\sqrt{2(1 - d)}}$, respectively. Prove them to be normalized and orthogonal.

5. Solve the problem of a system with two degenerate unperturbed levels of the same energy, by the method of variation of constants. Show that the equations for the time derivatives of the $c$'s can be solved by assuming an exponential or sinusoidal solution. Show that the final solution is a pulsation from one state to the other, the frequency of pulsation being $H_{12}/h$.

6. Prove by perturbation theory that the energy levels of a linear oscillator are not affected by a constant external field, except in absolute value, all being shifted up or down together. Why should this be expected physically?

7. Find whether a rotator's energy is affected, to the first or higher orders of approximation, by a constant external field in the plane of the rotator.

8. Prove in Einstein's derivation of Planck's radiation law that $B_{12} = B_{21}$, by considering equilibrium in the limiting case of extremely high temperature, noting that in this limit the probability of forced transition is large compared with that of spontaneous transition, on account of the large density of radiation.

9. Prove directly from Schrödinger's equation that the sum $\sum_n \bar c_n c_n$ always remains constant.

10. For the problem of interaction of atoms and radiation, when the atom starts in the $m$th state, work out $\bar c_m c_m$ as a function of time, and show that this, added to the other $\bar c_n c_n$'s, gives a constant.
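Results of the kind asked for in Probs. 3 and 4 can be checked numerically before a proof is attempted. The sketch below is a hedged illustration with made-up numbers: it assumes real, symmetric matrix components, so that $H_{12} = H_{21}$, and the function names `levels` and `overlap` are hypothetical. It finds the roots of the determinant of $H - Ed$ for two normalized but nonorthogonal functions, and evaluates the overlap integrals of the sum and difference combinations.

```python
import math

def levels(H11, H12, d):
    """Roots E of det(H - E d) = 0 for H = [[H11, H12], [H12, H11]]
    and overlap matrix [[1, d], [d, 1]] (real symmetric case assumed)."""
    # (H11 - E)^2 - (H12 - E d)^2 = 0, written as a E^2 + b E + c = 0
    a = 1.0 - d * d
    b = -2.0 * (H11 - H12 * d)
    c = H11 * H11 - H12 * H12
    disc = math.sqrt(b * b - 4.0 * a * c)
    return sorted(((-b - disc) / (2 * a), (-b + disc) / (2 * a)))

def overlap(a, b, d):
    """Integral of (a1 u1 + a2 u2)(b1 u1 + b2 u2) when the u's are
    normalized and the integral of u1 u2 equals d."""
    return a[0] * b[0] + a[1] * b[1] + d * (a[0] * b[1] + a[1] * b[0])

H11, H12, d = -1.0, -0.3, 0.2                 # illustrative values only
print(levels(H11, H12, d))
print((H11 + H12) / (1 + d), (H11 - H12) / (1 - d))

s = 1.0 / math.sqrt(2 * (1 + d))
r = 1.0 / math.sqrt(2 * (1 - d))
plus, minus = (s, s), (r, -r)
print(overlap(plus, plus, d), overlap(minus, minus, d), overlap(plus, minus, d))
```

A check of this kind does not replace the proofs asked for, but it guards against sign errors in the secular equation.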
CHAPTER XXXIII

THE HYDROGEN ATOM AND THE CENTRAL FIELD

In the preceding chapters we have been discussing the general principles and methods of wave mechanics. We have seen that from wave mechanics one can derive ordinary Newtonian mechanics as a special case. But by far the most interesting mechanical problem which demands wave mechanics for its solution is the structure of atoms, molecules, and matter in general. We shall accordingly devote the remaining chapters of this book to the structure of matter. This is a problem which is doubly interesting; first, as a most important subject in itself, secondly, as the finest illustration of wave mechanics.

240. The Atom and Its Nucleus. An atom consists of a nucleus, and a number of electrons. All electrons are alike, electrified particles of negative charge $-e = -4.774 \times 10^{-10}$ e.s.u., mass of $9.00 \times 10^{-28}$ gm. Nuclei are heavier, and positively charged. The charges on nuclei are found in every case to be integral multiples of the charge $e$. Thus a nucleus may have a charge $Ze$, where $Z$ is an integer, and in this case $Z$ is called the atomic number. If the atom has enough electrons to be electrically neutral, it is obvious that it must have $Z$ electrons, so that the atomic number measures both the charge on the nucleus and the number of electrons in the neutral atom. We shall see that this number $Z$ is the determining quantity in fixing the properties of the atoms; if all atoms are tabulated in order of their atomic numbers, they show periodic properties, for reasons which we shall discuss in the next chapter, and this arrangement is called the periodic table of the elements. Of course, the number of electrons on the atom does not always have to be just the atomic number; violent methods, as bombardment, can knock electrons off, or in some cases extra electrons can be added, producing positive or negative ions, respectively.
We shall see that some elements, the electropositive or alkaline ones, have a tendency to lose electrons and form positive ions, while the electronegative elements tend to gain electrons and become negative ions. Atoms often enter chemical compounds as ions, rather than neutral atoms, so that in our study of atomic structure we shall have to speak constantly of ions as well as neutral atoms.

The element of atomic number one is hydrogen, the simplest element. Its nucleus is an elementary particle, called the proton, with mass 1,846 times that of the electron. The heavier nuclei appear to be built up from a combination of protons and neutrons, particles of no charge, but of mass approximately equal to that of the proton. There are approximately equal numbers of protons and neutrons in any nucleus, making the atomic weight (the mass of the nucleus, in multiples of the mass of the proton) approximately twice the atomic number, though this rule is far from exact, the heavier atoms containing more neutrons in proportion than the light ones. The forces holding the nucleus together are presumably largely forces of attraction between protons and neutrons, more than counterbalancing the repulsions between protons on account of their like electric charges. By the action of these forces, stable structures are produced, disintegrating only in the case of the heavy, radioactive elements, or in the very light elements under heavy bombardment. The theory of the structure of the nucleus is still in a preliminary state, and we shall not consider it; ordinary properties of matter prove to be almost completely independent of the nuclear structure, depending only on its charge and mass, with most properties depending only on its charge, so that two nuclei of the same charge and different masses, called isotopes, exhibit almost identical properties.
Such isotopes are of very common occurrence, many ordinary elements being a mixture of several, the chemical atomic weights being weighted means of the weights of the isotopes, which explains why many observed atomic weights are far from whole numbers.

241. The Structure of Hydrogen.
The simplest element is hydrogen, with but one electron moving about a single nucleus. Fortunately the problem of its structure, according to wave mechanics, can be exactly solved, and it serves as a model for the more complicated elements. In fact, we have already carried out many of the mathematical steps in problems at one time or another, so that we shall merely have to summarize results here. For generality, we shall treat not merely hydrogen, but the problem of a single electron moving about a nucleus of charge $Ze$. The first thing we notice is that the nucleus is very heavy, compared with an electron. Now if we have a single electron and a single nucleus, exerting forces on each other, we find, in wave mechanics as in classical mechanics, that the center of gravity of the system remains fixed, each particle moving about the common center of gravity. But the center of gravity is very close to the nucleus; it divides the vector joining nucleus and electron in the ratio of 1:1,846. Thus the nucleus executes only very slight motions, and practically we can treat it as being fixed, and the electron as moving about a fixed center of attraction. We shall find that this is a very general method in discussing the structure of matter: we first assume all nuclei to be fixed, and discuss the motion of the electrons about them. Only later do we have to take the motions of the nuclei into account. We discuss this in more detail in a later chapter.

We have, then, an electron of charge $-e$, mass $m$, moving in a central field of force. The attractive force of the nucleus has a potential energy $-Ze^2/r$.
Thus Schrödinger's equation, with the time eliminated, is
$$Hu = \left[-\frac{h^2}{8\pi^2 m}\nabla^2 - \frac{Ze^2}{r}\right]u = Eu. \tag{1}$$

We shall find it convenient in all our atomic problems to introduce at the outset so-called atomic units of distance and energy. The unit of distance is $a_0 = h^2/4\pi^2 me^2$, a unit first introduced in Bohr's theory of the hydrogen atom, but which comes into the present discussion as well. It is equal to 0.53 Ångström. The unit of energy most convenient to use is $2\pi^2 me^4/h^2$, though sometimes a unit twice as great is used. This is the energy required to ionize a hydrogen atom from its normal state. It is most conveniently stated, not in ergs, but in volt-electrons. A volt-electron by definition is the energy an electron acquires in falling through a difference of potential of 1 volt, or $eV = 4.774 \times 10^{-10} \times \frac{1}{300}$ ergs. In terms of this, our fundamental unit of energy is 13.54 volt-electrons. Associated with this energy is a frequency, given by energy $= h\nu$, and a wave length, and its reciprocal a wave number, given by $1/\lambda = \nu/c$. The wave number associated with our unit of energy is the so-called Rydberg number, $R = 109{,}737$ per centimeter, and the corresponding energy is $Rhc$.

In terms of our atomic units, Schrödinger's equation for hydrogen can be rewritten, eliminating all the dimensional constants. Thus, if our new distances are the old ones divided by $a_0$, the new energy the old divided by $Rhc$, we easily find that
$$\left(-\nabla^2 - \frac{2Z}{r}\right)u = Eu, \tag{2}$$
where the derivatives are to be taken with respect to the new $x$, $y$, $z$. The coefficient 2 in the potential energy appears in the process of changing variables, the potential energy of two electronic charges being $2/r$ in these units.

Schrödinger's equation can now be solved, in spherical coordinates, by separation of variables. Using the results of Chap. XV, Probs. 6 to 8, the equation can be separated, letting $u = R\Theta\Phi$, and the differential equations are
$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \left[E + \frac{2Z}{r} - \frac{l(l+1)}{r^2}\right]R = 0,$$
$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{d\Theta}{d\theta}\right) + \left[l(l+1) - \frac{m^2}{\sin^2\theta}\right]\Theta = 0,$$
$$\frac{d^2\Phi}{d\phi^2} + m^2\Phi = 0. \tag{3}$$
The solutions of the second and third are $\Theta = P_l^m(\cos\theta)$, $\Phi = e^{\pm im\phi}$, or $\cos m\phi$ or $\sin m\phi$, where $m$ must be an integer in order to have the function single-valued as far as $\phi$ is concerned, and $l$ must be an integer in order not to have the function $P$ become infinite for $\cos\theta = 1$. The $P$'s are called associated spherical harmonics, and are given by
$$P_l^m(\cos\theta) = \sin^{|m|}\theta\,(A_0 + A_1\cos\theta + A_2\cos^2\theta + \cdots), \qquad A_k = A_{k-2}\,\frac{(k + |m| - 1)(k + |m| - 2) - l(l+1)}{k(k-1)}. \tag{4}$$
For integral $l$'s, this series breaks off, the last nonvanishing term being for $k = l - |m|$. For even $l - |m|$, the expansion is in even powers, and for odd $l - |m|$ in odd powers.

The functions $R$ are discussed in Prob. 3. We use a simple transformation of the dependent variable, $y = rR$. The equation in this variable is
$$\frac{d^2y}{dr^2} + \left[E + \frac{2Z}{r} - \frac{l(l+1)}{r^2}\right]y = 0. \tag{5}$$
The solution is
$$y = e^{-r\sqrt{-E}}\,r^{l+1}(A_0 + A_1 r + A_2 r^2 + \cdots), \qquad A_k = -2A_{k-1}\,\frac{Z - (l+k)\sqrt{-E}}{(l+k)(l+k+1) - l(l+1)}. \tag{6}$$
This series breaks off if $E = -Z^2/n^2$, where $n$ is an integer. A simple discussion shows that if it does not break off, the resulting infinite series becomes infinite as $r$ becomes infinite like $e^{2r\sqrt{-E}}$, so that $y$ becomes infinite, and is not admissible as a wave function for a stationary state. We therefore limit ourselves to integral $n$'s, and $n$ is called the principal or total quantum number, determining the energy. In terms of it, we have
$$y = e^{-rZ/n}\,r^{l+1}(A_0 + A_1 r + \cdots + A_{n-l-1}r^{n-l-1}), \qquad A_k = -\frac{2Z}{n}\,A_{k-1}\,\frac{n - l - k}{(l+k)(l+k+1) - l(l+1)}. \tag{7}$$
From this recursion formula, we see that $l$ cannot be greater than $n - 1$, in order to have any terms to the series; and from the earlier recursion formula for the function of $\theta$, $|m|$ cannot be greater than $l$.
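The recursion formula (7) is easy to check by direct computation. The following is a minimal sketch, not part of the original text: the function names `radial_coefficients` and `energy` are illustrative, the normalization $A_0 = 1$ is an arbitrary choice, and exact fractions are used so that the termination of the series at $k = n - l$ is visible without rounding.

```python
from fractions import Fraction


def energy(n, Z=1):
    """Energy E = -Z^2/n^2 in the atomic (Rydberg) units of the text."""
    return Fraction(-Z * Z, n * n)


def radial_coefficients(n, l, Z=1):
    """Coefficients A_0 ... A_{n-l-1} of the polynomial in Eq. (7).

    Assumption for illustration: A_0 is set to 1 (the wave function is
    left unnormalized). The loop stops when the factor (n - l - k) in the
    recursion vanishes, which is where the series breaks off.
    """
    if n < 1 or l < 0 or l > n - 1:
        raise ValueError("need n >= 1 and 0 <= l <= n - 1")
    coeffs = [Fraction(1)]
    k = 1
    while n - l - k != 0:
        denominator = (l + k) * (l + k + 1) - l * (l + 1)
        factor = Fraction(-2 * Z * (n - l - k), n * denominator)
        coeffs.append(coeffs[-1] * factor)
        k += 1
    return coeffs


if __name__ == "__main__":
    for n, l in [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]:
        print(n, l, energy(n), [str(a) for a in radial_coefficients(n, l)])
```

For $n = 3$, $l = 0$ the sketch gives the polynomial $1 - \tfrac{2}{3}r + \tfrac{2}{27}r^2$, and for $l = n - 1$ it gives a single term, so that $y$ has no nodes.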
The principal quantum number $n$ and the so-called azimuthal quantum number $l$ cannot be negative, the smallest allowable value of $n$ being 1 and of $l$ zero. The so-called magnetic quantum number $m$, however, can be positive, negative, or zero, so long as its magnitude falls within the allowed limits.

242. Discussion of the Function of r for Hydrogen.
Though we have an exact solution for hydrogen, a qualitative discussion is still desirable, using the method of energy. In Chap. VII we have already discussed motion in a central field in classical mechanics. We have seen that the motion along the radius is like a one-dimensional oscillation, in a potential field $V + p^2/2mr^2$, where $V$ is the potential energy, $p$ the angular momentum. In our case, the differential equation for $y$ is like a one-dimensional wave mechanical problem with a potential, in atomic units, of $-\frac{2Z}{r} + \frac{l(l+1)}{r^2}$, or in ordinary units $-\frac{Ze^2}{r} + \frac{p^2}{2mr^2}$, where $p = \sqrt{l(l+1)}\,\frac{h}{2\pi}$. It is thus clear in the first place that the quantum number $l$ determines the angular momentum, in units of $h/2\pi$, though the values are not $l$ times this unit, but $\sqrt{l(l+1)}$ times it. We shall further discuss the angular momentum later on.

Now it is interesting to draw the various potentials, as we do in Fig. 66, where $-\frac{2}{r} + \frac{l(l+1)}{r^2}$ is plotted, for $l = 0, 1, 2$. We have also plotted $-\frac{2}{r} + \frac{k^2}{r^2}$, indicated by the dotted lines. The reason for this is that in Bohr's theory of hydrogen, it was assumed that the electron moved according to classical mechanics, and that its energy could have only those particular values for which the quantum conditions were fulfilled. Bohr assumed that the angular momentum was $kh/2\pi$, where $k$ was an integer, so that if we discuss the classical motion with these dotted potential curves, we shall have precisely Bohr's orbits. He also assumed
$$\oint p_r\,dr = \oint \sqrt{2m\left(E - V - \frac{p^2}{2mr^2}\right)}\,dr = n_r h,$$
where $p = kh/2\pi$. The energy levels, either on Bohr's theory or wave mechanics, are $-1/n^2$, where on Bohr's theory $n = k + n_r$, and these are drawn, at $-1$, $-\frac{1}{4}$, $-\frac{1}{9}$, etc.

Fig. 66. Potential and energy levels for hydrogen (energy plotted against $r$). Full lines: $-\frac{2}{r} + \frac{l(l+1)}{r^2}$ (potential corrected for centrifugal force, wave mechanics). Dotted lines: $-\frac{2}{r} + \frac{k^2}{r^2}$ (corrected potential, Bohr theory). Horizontal lines represent energy levels.

Now consider the particular case $k = 1$. The lowest possible energy level for this is evidently $-1$; for here $E$ intersects the potential curve at but one point, giving, therefore, a circular orbit, the perihelion and aphelion distances being equal. As we see from the diagram, the radius of the circular orbit is one unit, and the energy minus one unit, explaining, therefore, the origin of the units. But for this same $k$, higher energy levels are connected with elliptical orbits, as, for example, that for which $n = 2$, $k = 1$, with perihelion smaller, aphelion larger than the circle for $n = 1$. For $n = 2$ there is a second Bohr orbit, for $k = 2$: a circle of radius 4 units. Similarly for $n = 3$, there are three orbits, for $k = 1, 2, 3$, and so on, the orbit for $k = n$ being in each case a circle. This question is discussed in a problem, where it is shown that the orbits are ellipses, of semimajor axis equal to $\frac{n^2}{Z}\,\frac{h^2}{4\pi^2 me^2} = \frac{n^2}{Z}\,a_0$, and minor axis equal to $k/n$ times the major axis.

In the wave mechanics, where the angular momentum has the nonintegral value $\sqrt{l(l+1)}$ units, we must use the full lines. Now we are interested in the region where the kinetic energy is positive, not as the only place where motion can occur, but as the region where the wave function is sinusoidal.
Outside this region, it falls off exponentially. We can see a few examples in Fig. 67, in which the first few wave functions are plotted (we plot $y$, equal to $r$ times the radial part of the wave function). On each function the limits of the region of classical motion are determined by the fact that the points of inflection come here, the tendency of the curves being sinusoidal between the points, exponential outside. It is plain that the wave functions are larger where the electron is likely to be found, small where it is not, as we could prove by deriving the solution from the Wentzel-Kramers-Brillouin method, a possible, though not very convenient, method of discussing the hydrogen problem. As this method would show at once, the wave length and amplitude both become large as $r$ becomes large, and $E - V$ becomes small, so that the outermost maximum of the wave function is in all cases the largest, and contributes most to the wave function as a whole.

One property of the wave function is evident from Fig. 67: for small $r$, the behavior is determined mostly by $l$, for large $r$ mostly by $n$. This is natural from the fact that for small $r$ the quantity $E + \frac{2Z}{r} - \frac{l(l+1)}{r^2}$ approaches $-\frac{l(l+1)}{r^2}$, and for large $r$ it approaches $E + \frac{2Z}{r} = -\frac{Z^2}{n^2} + \frac{2Z}{r}$. We note that as $l$ becomes smaller and smaller, the region where the wave function is large, or the classical orbit, penetrates closer and closer to the nucleus. For large $r$, and, as a matter of fact, for the whole outer maximum, which, as we have seen, is the most important one, a fairly good approximation to the wave function is simply $r^n e^{-Zr/n}$, the wave function for the orbit of maximum azimuthal quantum number ($l = n - 1$), corresponding to the circular orbit in Bohr's theory. It is interesting to note that this function has its maximum at $r = \frac{n^2}{Z}\,a_0$, just the radius of the corresponding circular orbit in Bohr's theory.

243. The Angular Momentum.
We have seen that the quantity $\sqrt{l(l+1)}\,\frac{h}{2\pi}$ corresponds to the angular momentum of the orbit. This can be seen by computing the matrix of total angular momentum, or rather of its square, which is more convenient. We can most easily get the operator for the angular momentum, in spherical coordinates, by an indirect method. Classically, $H = p_r^2/2m + p^2/2mr^2 + V$, where $p$ is the total angular momentum. Now in wave mechanics we find the wave equation such that
$$H = \frac{1}{2mr^2}\left(\frac{h}{2\pi i}\frac{\partial}{\partial r}\right)\left(r^2\,\frac{h}{2\pi i}\frac{\partial}{\partial r}\right) + \frac{1}{2mr^2}\left[\frac{1}{\sin\theta}\left(\frac{h}{2\pi i}\frac{\partial}{\partial\theta}\right)\left(\sin\theta\,\frac{h}{2\pi i}\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\left(\frac{h}{2\pi i}\right)^2\frac{\partial^2}{\partial\phi^2}\right] + V.$$
By comparison, it is plain that the operator for $p^2$ is
$$p^2 = \frac{1}{\sin\theta}\left(\frac{h}{2\pi i}\frac{\partial}{\partial\theta}\right)\left(\sin\theta\,\frac{h}{2\pi i}\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\left(\frac{h}{2\pi i}\right)^2\frac{\partial^2}{\partial\phi^2}. \tag{8}$$
But now from the differential equations for $\Theta$ and $\Phi$, we easily have, using this operator,
$$p^2 u = l(l+1)\left(\frac{h}{2\pi}\right)^2 u. \tag{9}$$
That is, $p^2$ has a diagonal matrix (since $p^2 u$ is a constant times $u$, without any terms in other characteristic functions), and the diagonal value is $l(l+1)(h/2\pi)^2$, so that the total angular momentum is constant, as it must be in the absence of torques. We can also easily find the component of angular momentum along the $z$ axis. The angular momentum along this axis is the momentum conjugate to the angle $\phi$ of rotation about the axis, so that its operator is $\frac{h}{2\pi i}\frac{\partial}{\partial\phi}$. Now take the solutions where $\phi$ enters into the wave function as the exponential, $e^{\pm im\phi}$. Then
$$p_\phi u = \frac{h}{2\pi i}\frac{\partial u}{\partial\phi} = \pm m\,\frac{h}{2\pi}\,u.$$
This again is diagonal, showing that the component of angular momentum remains constant. Further, if we use the wave function $e^{im\phi}$, the component equals $m\,h/2\pi$.

The interpretation of these results is best made in terms of a vector model. Suppose we consider that the angular momentum of the orbit is $l\,h/2\pi$. This will then be regarded as a vector, normal to the plane of the orbit, pointing in some arbitrary direction in space.
The component of angular momentum along the $z$ axis is simply the projection of the vector in that direction. Now we find that this can have only the quantized values $m\,h/2\pi$. Hence there are only a finite number of possible orientations for the orbit, as shown in Fig. 68, for the states for $l = 3$. Plainly $m$ can go from a maximum of $l$ to a minimum of $-l$, or $2l + 1$ values in all, just as one finds from the discussion of the spherical harmonics.

Fig. 68. Possible orientations of angular momentum, for $l = 3$.

Now this vector diagram is only suggestive, not strictly true. We see this from the fact that our vector has length $l\,h/2\pi$, while the actual angular momentum is $\sqrt{l(l+1)}\,h/2\pi$. The fundamental reason is that, since the angular momentum and its component are exactly given, the uncertainty principle does not allow us to fix definitely the plane of the orbit, which corresponds to a coordinate. As a matter of fact, the electron in wave mechanics does not move exactly in a plane, but strays outside the plane, as the uncertainty principle would suggest. This is best shown by polar diagrams of the spherical harmonics, plotting the square of the spherical harmonic, which gives the density, as function of angle. This is done in Fig. 69, for $l = 1$, $m = 1$ and 0, and $l = 2$, $m = 2, 1, 0$. ($l = 0$ does not depend on angle.) If we imagine these figures rotated about the axes, we see that for $m = l$, the figure indicates that most of the density is in the plane normal to the axis, but considerable is out of the plane. For $l = 2$, $m = 1$, for instance, the density lies near a cone, as if the plane of the orbit took up all directions whose normal made the proper angle with the axis.

Fig. 69. Dependence of wave functions on angle. $\Theta^2$ plotted in polar diagram, for $l = 1$, $m = \pm 1, 0$ and $l = 2$, $m = \pm 2, \pm 1, 0$.

244. Series and Selection Principles.
All the states for a given value of $l$ and $n$, but different $m$, have the same function of $r$, and the same energy. We shall find that this is still true with an arbitrary central field, so that even in that problem the solution is degenerate. Physically, so long as the angular momentum is determined, it cannot make any difference as far as the energy is concerned which way the orbit is oriented, on account of the spherical symmetry. Thus we often group together the various substates with the same $l$ and $n$ but different $m$, regarding them as constituting a single degenerate state, with a $(2l + 1)$-fold degeneracy. For hydrogen, the energy as a matter of fact depends only on $n$, so that all states of the same $n$ but different $l$ values are degenerate, but this is not true in general for a central field. It is convenient, rather, to group all the states of the same $l$ value but different $n$ together to form a series, since they are closely connected physically, having the same functions of angle, while those of the same $n$ merely happen to have the same energy, but without important physical resemblances. The series of different $l$ values are conventionally denoted by letters, derived from spectroscopy. We have the following table:

l value   Letter   States             Order of degeneracy
0         s        1s, 2s, 3s, . . .  1
1         p        2p, 3p, 4p, . . .  3
2         d        3d, 4d, . . .      5
3         f        4f, 5f, . . .      7
4         g        5g, . . .          9

By order of degeneracy we mean simply the number of sublevels of different $m$ values.

The classification into series becomes important when we consider the transition probabilities from one level to another. We recall that these are given by the matrix components of the electric moment between the states in question. When these components are computed, it is found that there are certain selection rules:
1. The component is zero unless the $l$'s of the two states differ by $\pm 1$ unit.
2. The component is zero unless the $m$'s differ by 0 or $\pm 1$ unit.
The latter rule is easily proved. For, suppose we compute the matrix components of $x + iy$, $x - iy$, $z$, which are simple combinations of $x$, $y$, $z$, the three components of displacement. If we find the matrix components of all three of these to be zero for a given transition, the transition will be forbidden. Now these three quantities, in polar coordinates, are $r\sin\theta\,e^{i\phi}$, $r\sin\theta\,e^{-i\phi}$, $r\cos\theta$, respectively. If $u$ is $R\Theta e^{im\phi}$, we have $(x + iy)u = rR\sin\theta\,\Theta\,e^{i(m+1)\phi}$, showing that this quantity has a matrix component only to states having the quantum number $m + 1$, since the quantity on the right could be expanded in series of functions with many values of $n$ and $l$, but only the one value $m + 1$. Similarly $(x - iy)u = rR\sin\theta\,\Theta\,e^{i(m-1)\phi}$, allowing transitions only from $m$ to $m - 1$, and $zu = rR\cos\theta\,\Theta\,e^{im\phi}$, allowing only transitions in which $m$ does not change. The proof of the selection principle for $l$ is slightly more difficult, involving the theorem that $\sin\theta\,P_l^m(\cos\theta)$ or $\cos\theta\,P_l^m(\cos\theta)$ can be expanded in spherical harmonics whose lower index is $l + 1$ or $l - 1$ only.

The selection rules have the following results: If we arrange the series in order, $s\,p\,d\,f\ldots$, a level of one series can only have transitions to the immediately adjacent series. This gives us the transitions indicated in Fig. 70 (all of the transitions between upper states are not indicated; merely some of the more important ones down to lower states). The series of lines arising from transitions of the $p$ states to $1s$ is called the principal series; from the $s$ terms to $2p$, the sharp series; from the $d$ terms to $2p$, the diffuse series; from the $f$ terms to $3d$, the fundamental series. The letters $s$, $p$, $d$, $f$ are the initials of these series. When the matrix components are worked out, the strongest lines are those in which $l$ decreases by one unit (principal, diffuse, and fundamental series), and those for which $l$ increases (as the sharp series) are weaker.
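The bookkeeping implied by the two selection rules can be sketched in a few lines. This is an illustrative enumeration only, not a computation of matrix components: the function names are hypothetical, the levels are labeled as in hydrogen, and the letter string covers only $l \le 5$.

```python
LETTERS = "spdfgh"  # spectroscopic letters for l = 0 ... 5 (so n_max <= 6 here)


def is_allowed(l_upper, m_upper, l_lower, m_lower):
    """Selection rules quoted in the text: l changes by 1, m by 0 or 1."""
    return abs(l_upper - l_lower) == 1 and abs(m_upper - m_lower) <= 1


def allowed_lines(n_max):
    """List transitions 'n l -> n' l'' from a higher n to a lower n' with
    l differing by one unit, for all levels up to n_max."""
    lines = []
    for n in range(2, n_max + 1):
        for l in range(n):
            for n2 in range(1, n):
                for l2 in range(n2):
                    if abs(l - l2) == 1:
                        lines.append(f"{n}{LETTERS[l]} -> {n2}{LETTERS[l2]}")
    return lines


def count_substate_pairs(l_upper, l_lower):
    """Number of (m, m') pairs between two levels allowed by both rules."""
    return sum(
        is_allowed(l_upper, m1, l_lower, m2)
        for m1 in range(-l_upper, l_upper + 1)
        for m2 in range(-l_lower, l_lower + 1)
    )


if __name__ == "__main__":
    print(allowed_lines(3))
    print(count_substate_pairs(2, 1))
```

With `n_max = 3` the list contains the first members of the principal ($2p \to 1s$, $3p \to 1s$), sharp ($3s \to 2p$), and diffuse ($3d \to 2p$) series, together with $3p \to 2s$.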
Of course, on account of the degeneracy in $l$ in hydrogen, the different series are not separated, but they are in other atoms, and it is for those that the classification is important. To see this, we must study the energy levels in the general central field.

Fig. 70. Energy levels and allowed transitions and series in hydrogen.

245. The General Central Field.
We shall find that in discussing atomic structure, we shall wish to consider that each electron moves in a central field, but not an inverse square field. The field is rather the sort which we should have if there were a nucleus of charge $Z$ units, surrounded by a spherical ball of negative charge, having a total charge $-(Z - 1)$ units, corresponding to the remaining electrons of the atom. Such a field has a potential $-\frac{2Z(r)}{r}$, where $Z(r)$ goes from 1 at large $r$ to $Z$ at small $r$. For such a potential, most of our discussion goes through without alteration. The differential equation can be separated in the same way, and the functions of angle are just the same, so that our classification into series, vector model, and selection principles holds as with hydrogen. The only difference comes in the function of $r$, and in the values of the energy levels. We can no longer solve the equation exactly, and shall use the qualitative method of discussion.

In Fig. 71 we show a diagram, like Fig. 66, in which we plot $-\frac{2Z(r)}{r} + \frac{l(l+1)}{r^2}$. The potential is so chosen that for $r$ greater than unity, $Z(r)$ is just unity, but for smaller $r$'s $Z(r) = 10 - 9r$, so that the charge approaches 10 at $r = 0$, but joins on smoothly at $r = 1$. It is obvious that the $s$ electrons are greatly affected by the change in potential. The $1s$ wave function is located practically all inside $r = 1$. Thus its potential curve is practically $-\frac{2(10 - 9r)}{r} = -\frac{2(10)}{r} + 2(9)$ for the whole range.
In other words, it is like a hydrogen problem of nuclear charge 10 units, but with the constant correction 2(9) to be added to the energy. The energy of such a state would be $-(10)^2 = -100$, and when we add our constant 18, it is $-82$ units, showing that this level is very tightly bound. Similarly the $2s$ is largely inside, though not so completely, and to a somewhat poorer approximation its energy is $-\frac{(10)^2}{4} + 18 = -7$ units. The higher $s$ orbits, however, project out into the region beyond $r = 1$, where the potential is hydrogen-like with charge 1, and we shall discuss them in a moment. The $p$, $d$, . . . states, on the other hand, are almost entirely outside the range where the potential is not hydrogen-like. Their energy levels and wave functions are almost exactly like those of hydrogen.

It is seen from this discussion that we can divide the levels in such a case into three classes: (1) those entirely inside the range of large potential, which will prove to be those inside the atom; (2) those half in and half out; and (3) those entirely outside. The levels of larger $l$ values do not penetrate the inside, and belong to group 3. In this case, we reach this situation with $l = 1$, but with larger cores of negative charge about the nucleus, and so larger regions where the potential is much greater numerically than in hydrogen, the $p$ electrons, or in some cases the $d$ or even $f$ electrons, are penetrating. For the lowest $l$ values, in any case, the orbits of large $n$ are partly outside but penetrate inside, and those of small $n$ are entirely inside the core of negative charge.

Fig. 71. Potential and energy levels for a central field, with $Z(r) = 10 - 9r$ from $r = 0$ to 1, $Z(r) = 1$ for $r$ greater than unity. Left-hand diagram on different energy scale.

These penetrating orbits have quite different energy values from the nonpenetrating ones, so that the different series do not lie on top of each other, as in hydrogen. For the orbits which penetrate in their inner parts only, we get a formula for the energy, from the quantum condition. This formula is most conveniently derived using Bohr's form of the azimuthal quantum condition. We have $\oint p_r\,dr = n_r h$ for the radial quantum condition. Then for hydrogen,
$$\oint p_r\,dr + kh = nh = \frac{h}{\sqrt{-E}},$$
where $k$ is Bohr's azimuthal quantum number. Thus $\oint p_r\,dr = \frac{h}{\sqrt{-E}} - kh$. For a penetrating orbit with our form of potential, the integral over the outer part of the orbit, where the potential, and hence $p_r$, are hydrogen-like, will have just the same value as here, if we use the proper energy. For the inside, however, $p_r$ is much greater, so that there is an additional contribution to the integral, as we see from Fig. 72. This contribution, moreover, is roughly the same for all terms of the same $k$ value, since the inner part of the orbit depends almost entirely on the angular momentum alone. Thus we have for the general case
$$\oint p_r\,dr = \frac{h}{\sqrt{-E}} - kh + \delta_l h,$$
where $\delta_l$ is a function of $k$ only, to the first approximation. The result must be $n_r h$, by the radial quantum condition, so that we have
$$E = -\frac{1}{(n_r + k - \delta_l)^2} = -\frac{1}{(n - \delta_l)^2}, \tag{10}$$
where $n$ is the total quantum number, and where $\delta_l$ is called the quantum defect. A more careful discussion, using the Wentzel-Kramers-Brillouin method, shows that the same formula still holds when we use $\sqrt{l(l+1)}$ in place of $k$, and remember that we must use half quantum numbers.

Fig. 72. Phase space and phase integral for $r$, penetrating and nonpenetrating orbits. (1) and (2): Nonpenetrating orbits of same $k$, different $n$. (3) combined with (2): Penetrating orbit, having same energy as (2), but in a non-Coulomb field, so that it has a different quantum number and phase integral. Shaded area represents the quantum defect $\delta$.
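As a small numerical illustration of Eq. (10), the sketch below evaluates the energy for a given quantum defect and inverts the relation to recover a defect from an energy. The function names are hypothetical and the defect values in the example are invented for illustration, not taken from any measured spectrum; energies are in the atomic units of the text, so that a nonpenetrating level has $E = -1/n^2$.

```python
import math


def level_energy(n, delta=0.0):
    """Energy from Eq. (10): E = -1/(n - delta)^2, in atomic (Rydberg) units."""
    return -1.0 / (n - delta) ** 2


def quantum_defect(n, energy):
    """Invert Eq. (10): delta = n - 1/sqrt(-E), for a bound level (E < 0)."""
    if energy >= 0:
        raise ValueError("Eq. (10) applies only to bound levels with E < 0")
    return n - 1.0 / math.sqrt(-energy)


if __name__ == "__main__":
    # Hypothetical defect of 1.35 for an s series, zero for a nonpenetrating one.
    for n in (3, 4, 5):
        print(n, level_energy(n, 1.35), level_energy(n, 0.0))
```

A positive defect lowers the energy below the hydrogen value for the same $n$, which is the sense in which a penetrating orbit is more tightly bound.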
This formula, which can be written, in wave numbers, $E = -\frac{R}{(n - \delta_l)^2}$, is called Rydberg's formula, and was first discovered experimentally by Rydberg. We see then that the penetrating orbits fall into series as the nonpenetrating ones do, but that we must subtract the quantum defect from the quantum numbers. These quantum defects range from 0 for the nonpenetrating orbits to sometimes quite large values, even of the order of 5 or 6, for the $s$ electrons of heavy atoms. From experimental observations of spectral series, we can find the quantum defects, and so tell which orbits are penetrating, and which are not. In the next chapter we shall discuss in more detail the energy levels for the orbits entirely inside the atom, which are most directly concerned in atomic structure.

The wave functions for the central field of the type we are discussing are not very different in general from those for hydrogen. But there are important differences in detail. We note that a hydrogen-like orbit corresponding to the problem of nuclear charge $Z$ is $1/Z$ times as great as that for nuclear charge 1. Hence, in the case of Fig. 71, the $1s$ and $2s$ orbits are something like 1/10 as large as for hydrogen. The penetrating orbits, like $3s$, $4s$, etc., will have the inner loops small in proportion, as the $1s$ and $2s$ are, but the outer parts, being in a field of charge 1, will be large. Thus there will be a much greater disparity between the size of the inner and outer loops than even for hydrogen, the outer ones being much more important in consequence. We may see this from the Wentzel-Kramers-Brillouin method. Here both amplitude and wave length go inversely with $p_r$. In the penetrating part of the orbit, $p_r$ is much greater than for hydrogen, for the same total energy, so that amplitude and wave length become extremely small.
The physical way to say this is that the electron moves very fast when it penetrates the core and is exposed to the whole charge of the nucleus, and hence spends but a very short time there, so that the wave function is small. For actually computing the wave functions, we can best use numerical integration of the differential equation, or the THE HYDROGEN ATOM AND THE CENTRAL FIELD 423 method of Wentzel, Kramers, and Brillouin. We shall discuss wave functions more in detail in the next chapter. Problems 1. Work out the spherical harmonics for I = 3, and draw diagrams for them similar to Fig. 69. 2. Prove from the differential equation that the associated spherical harmonics are orthogonal. Verify this for the cases of I = 1 and 2. 3. Carry out the solution of the radial wave function for hydrogen, deriv- ing Eqs. (5), (6), and (7), following the method outlined in the text, and verifying that if the series does not break off it represents a function which becomes infinite as r approaches infinity. 4. Show that y 2 dr, where y = rR, is proportional to the probability of finding the electron between r and r + dr. Compute radial wave functions for states Is, 2s, 3s for hydrogen, and draw graphs of y 2 . 5. Prove that for a radial wave function without nodes (I = n â€” 1), for nuclear charge Z, the maximum of y comes at n 2 /Z. 6. Using the results of Prob. 3, Chap. IX, set up the radial phase integral for Bohr's model of hydrogen, showing that E = â€”1/n 2 . Using the prop- erties of the ellipse mentioned in Prob. 4, Chap. VII, verify the statements of Sec. 242 regarding the dimensions of the orbits. 7. Draw an energy level diagram in which the substates of different m's are shown, drawing them as if slightly separated, including states Is, 2s, 3s, 2p, 3p, 3d. Indicate all transitions allowed by the selection principles for I and m, as in Fig. 70. 8. Prove that the potential used in Fig. 
71 is what would be found with a nucleus of 10 units charge, surrounded at distance unity by a hollow sphere, with 9 units of negative charge uniformly distributed over the surface. 9. A rough model of the inner electrons of the sodium atom can be obtained by assuming the nucleus of charge 11 units; a shell of radius 0.09 units, with two electronic charges spread over the surface; and a shell of radius 0.58 units, with 8 electrons spread over it, so that the net charge is 1 unit positive. Set up a diagram like Fig. 71 for such a potential field, drawing the potential functions for s, p, d electrons. Find which orbits are nonpenetrating. 10. Using the potential of Fig. 71, and Bohr's azimuthal quantum condi- tion, compute the positions of 3s, 4s, and 5s levels. To do this, evaluate the radial quantum integral, computing separately the parts inside and outside r = 1, set the sum equal to n r h, and solve for the energy, using numerical methods if necessary to solve the transcendental equation. Find how closely the result fits with the Rydberg formula, computing quantum defects for each level. 11. In the field of Fig. 71, the p electrons do not have exactly the hydrogen energies, for their wave function is not zero in the region inside r = 1, where the potential is not hydrogen-Uke. Compute the first-order perturbed value of the energies of 2p, 3y>, 4p, by using hydrogen wave functions as the starting point of a perturbation calculation, and assuming the difference between the hydrogen potential and the actual one as perturbative potential. 424 INTRODUCTION TO THEORETICAL PHYSICS Compute quantum defects for each level, seeing how well the Rydberg formula is obeyed. It is to be noted that in such a case as this, the second- order perturbation is often more important than the first, so that our calcula- tion is not very accurate. 12. 
Apply the Wentzel-Kramers-Brillouin method to the wave functions of hydrogen, computing approximate radial functions for 3p, 4p, and comparing with the exact solutions.

CHAPTER XXXIV

ATOMIC STRUCTURE

The electrons in an atom move, to an approximation, in central fields of force, each in the field produced by the nucleus and the average charge of the other electrons. Thus, as we have seen in the last chapter, there are different quantum numbers which they can have. We can have in an atom 1s, 2s, 2p, . . . electrons. All electrons of a given total quantum number, inside the atom, have roughly the same radius for the maximum of their wave functions, and roughly the same energy, in contrast to the electrons which are largely outside, in which s and p electrons are more tightly bound on account of penetration. We can then group the electrons of the same total quantum number together into shells, those of $n = 1$ forming what is called the K shell, those with $n = 2$ the L shell, $n = 3$ the M shell, etc., the letters K, L, M, . . . coming from x-ray notation. The inner electrons are the most tightly bound and hardest to remove, and hence connected with the highest frequencies in the spectrum: the K series of x-rays, connected with the electrons of the K shell, has shortest wave length, L series next, and so on. On the other hand, an outer electron is shielded from the nuclear attraction by the presence of the other electrons; for the electrical force acting on a charge in a spherical distribution is what we should have if we imagined a sphere drawn about the center through the charge in question, forgot about all charge outside this sphere, imagined the charge inside the sphere concentrated at the center, and calculated its attraction by the inverse square law.
Thus for an inner electron we forget about almost all the other electrons and have practically the unadulterated attraction of the nucleus, but with an outer electron the number of other electrons within the sphere is almost equal to the nuclear charge and almost cancels it, leaving only a small net attraction, and an easily detached electron. It is convenient in this connection to speak of an "effective nuclear charge" $Z_e$, and a shielding constant $S$; $Z_e$ is the charge which, placed at the center, would produce the same attraction as the nucleus and electrons, and thus varies from $Z$ for the inner electrons down to the order of magnitude of 1 for the outer ones, and $S$ is defined by $Z_e = Z - S$, so that $S$ measures roughly the number of electrons inside the sphere in question. In general, we see that each electron in an atom, or at least each shell, will have a different shielding constant.

And now it is an important fact that the energies involved in ordinary chemical and physical processes are only large enough to remove or disturb the outer electrons of an atom, and leave the inner ones unaffected. Only x-rays, very violent bombardment, and such extreme means can disturb the inner electrons, and as a result we need not consider them in ordinary chemical and physical applications.

246. The Periodic Table. The series K, L, M, . . . of shells has no obvious end, and yet an atom has but a finite number of electrons. It is evident, then, that the shells cannot all be filled. The attraction of the nucleus will pull electrons into the lowest shells, until they are filled, and then the rest will have to go into higher ones. The capacity of a shell is strictly limited, according to a very important principle called the exclusion principle (excluding more than a certain number of electrons from a shell), so that a K shell can contain only 2 electrons, an L shell 8, an M shell 18, an N shell 32, and so on.
Using this principle, we can begin to see how the atoms build up, and in so doing we understand the structure of the periodic table (see Fig. 73), the fact that when atoms are tabulated according to atomic number their properties repeat themselves in a regular way. Thus hydrogen has but one electron, which naturally prefers to go into the K shell. Of course, it does not have to; it can be in a higher shell, or level, corresponding to a higher energy, and then it is an excited electron. But this is not a stable situation: collision with another atom or molecule, or interaction with radiation, is most likely to absorb the extra energy and permit the atom to fall to its lowest and most stable energy level, losing its excitation, so that this lowest level is the normal state. This situation of the existence of excited states, but the preference for the normal state, is characteristic of all the atoms, and for the moment we are describing the normal states, for they are the ones in which we ordinarily find the atoms.

To resume, helium has two electrons, and in the normal state they are both in the K shell. This shell is now completed, no more electrons can be bound in it, and such a completed shell is characteristic of the inert gases, of which helium is one. Lithium, with three electrons, would have two K electrons, and one L, and the latter would be loosely bound, and could be easily detached.

Fig. 73. Periodic table of the elements, with electron configuration of lowest states.

In connection with this, we observe that lithium is an alkali metal, very much inclined to form a singly charged positive ion, which it does by losing the one electron, the loss of unit negative
charge being the same as gaining unit positive charge. Next, beryllium with four electrons has two K's and two L's, and can easily lose the latter to form a divalent positive ion. Thus we go through boron with two K's and three L's, carbon with four L's (forming sometimes the ion with four positive charges) and nitrogen with five L's. By this time, however, the attractions between the outer electrons and the nucleus have become rather large, and they are not easy to detach. The reason for this is that as we get more electrons in a shell, the effective nuclear charge gets larger. For the electrons in a shell cannot shield each other very effectively; offhand we cannot say whether they are inside or outside the sphere of the last paragraph, and as a matter of fact the contribution to the shielding constant made by an electron in the same shell we are considering is only about 0.35 of an electronic unit. Thus if the effective nuclear charge for lithium's L electron were 1.30 (which is about the right amount, equal to $Z - S$ where $Z = 3$, $S = 1.70$ for the two K electrons, which do not shield perfectly), then for one of the two L electrons in beryllium we should have $4.00 - 1.70 - 0.35 = 1.95$, and for an L electron in boron $5.00 - 1.70 - 0.70 = 2.60$, increasing 0.65 for each atom, until for nitrogen we have 3.90 and for oxygen 4.55. Since the electrostatic attractions are proportional to the nuclear charge, this means that it is much harder to remove an electron from nitrogen than from lithium. By the time we come to oxygen and fluorine, we hardly have positive ions formed at all. But now another situation comes in: the attractions become so strong that an atom can pull an extra electron or two into its outer shell, forming a negative ion. Thus oxygen very easily forms a singly charged negative ion, and sometimes a doubly charged one.
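The progression of effective nuclear charges quoted above can be checked with a few lines of arithmetic. The following is only a sketch under the assumptions stated in the text: the two K electrons together contribute 1.70 to the shielding constant, and each other L electron contributes 0.35. The function name `effective_charge_L` is our own, introduced for illustration.

```python
# Effective nuclear charge Z - S for one L electron in the atoms Li to O,
# assuming (as in the text) S = 1.70 from the two K electrons plus 0.35
# for every *other* electron in the L shell.

K_SHIELDING = 1.70      # combined contribution of the two K electrons
SAME_SHELL = 0.35       # contribution of one other L electron


def effective_charge_L(z):
    """Return Z - S for an L electron of a neutral atom with 3 <= z <= 10."""
    n_l = z - 2                         # number of electrons in the L shell
    s = K_SHIELDING + SAME_SHELL * (n_l - 1)
    return z - s


if __name__ == "__main__":
    for name, z in [("Li", 3), ("Be", 4), ("B", 5), ("C", 6), ("N", 7), ("O", 8)]:
        print(f"{name}: Z - S = {effective_charge_L(z):.2f}")
```

Each step to the next atom adds one unit of nuclear charge and only 0.35 of shielding, which is the origin of the increase of 0.65 per atom.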
It cannot go farther than this, for with two extra electrons its L shell has eight electrons and is completed. Similarly, fluorine can form a singly charged negative ion, but no more. And finally neon, with ten electrons, has two K's, eight L's, and consists of closed shells. It is the next inert gas after helium. It forms no ions: it would have to hold an extra electron in the M shell, and this would not be tightly bound, so that it would not stay; or to form a positive ion, it would have to lose one of its L electrons, and these are held too tightly to be removed by ordinary chemical processes. Thus it is inert.

After neon, we next come to sodium, with eleven electrons. This has two K's, eight L's, and the next electron must be an M. That is, it has one loosely bound electron, just like lithium. It again has a tendency to form a singly charged positive ion, and is an alkali metal like lithium. Magnesium, next, has two M electrons, and is like beryllium. We begin here to see the origin of the periodic table, for we have advanced by eight in our series of elements and have come to elements of similar properties. The similarity persists in this way up through argon, with eighteen electrons. At that point, we must take account of a further fact which we have not mentioned. Each of these shells is really subdivided into subshells, of slightly different size and energy. The subshells are determined by the azimuthal quantum numbers, the states s, p, d, . . . of the same total quantum number becoming less tightly bound as we go out in the series, on account of decreased penetration. The maximum number of electrons in a subshell of a given designation is invariable: an s group can have only 2 electrons, a p group 6, a d group 10, an f group 14, and so on ($2 \times 1$, $2 \times 3$, $2 \times 5$, $2 \times 7$, . . . , or in general 2 times the number of subgroups of different $m$ values).
Now the K shell contains only the 1s group, accounting, therefore, for its maximum number 2 of electrons. The L shell contains a 2s and a 2p group, so that its maximum number is $2 + 6 = 8$. Similarly the M has subshells 3s, 3p, 3d, with a maximum number $2 + 6 + 10 = 18$, and N has 4s, 4p, 4d, 4f, with a possibility of $2 + 6 + 10 + 14 = 32$ electrons. When now we examine the energies of these various groups, we discover that the differences of energy between different subgroups of a shell may often be larger than those between different shells, with a result that the order of groups is changed. As a matter of fact, beginning with the most tightly bound shells, the groups are arranged as far as their energy is concerned approximately as shown in the following table, in which the first line gives the group, the second the number of electrons in the group, the third the total number of electrons in that group and all inside it, and the last the element completing the group, whose atomic number therefore stands just above it:

Group:      1s  2s  2p  3s  3p  4s  3d  4p  5s  4d  5p  6s  4f  5d  6p  7s
Electrons:   2   2   6   2   6   2  10   6   2  10   6   2  14  10   6   2
Total:       2   4  10  12  18  20  30  36  38  48  54  56  70  80  86  88
Element:    He  Be  Ne  Mg   A  Ca  Zn  Kr  Sr  Cd  Xe  Ba  Yb  Hg  Rn  Ra

Within each shell the subshells are arranged in the order stated, but there is overlapping between the shells.

We now see that at A (argon), although the M shell is not completed, still the 3p subshell is, and this is enough to form a closed group and an inert gas. Next we come to K, 19, with one 4s electron, another alkali, and Ca, 20, with two, an alkaline earth like Be and Mg. But now instead of forming a group of 8 by adding p electrons, the next additions go into the 3d shell, and only after that is filled up do they go into 4p, so that by the time we come to the next inert gas, Kr, we have added 18 electrons rather than 8 after A.
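The bookkeeping of the table above is easily reproduced. The sketch below takes the order of groups exactly as listed in the table, uses the capacity $2(2l + 1)$ for a group of azimuthal quantum number $l$, and accumulates the running totals; the names `capacity`, `shell_capacity`, and `running_totals` are ours, chosen for illustration.

```python
# Capacities of subshells and shells, and the running electron count
# as the groups are filled in the approximate energy order of the table.

LETTERS = "spdf"   # l = 0, 1, 2, 3

ORDER = ["1s", "2s", "2p", "3s", "3p", "4s", "3d", "4p",
         "5s", "4d", "5p", "6s", "4f", "5d", "6p", "7s"]


def capacity(l):
    """Maximum number of electrons in a group of azimuthal number l."""
    return 2 * (2 * l + 1)


def shell_capacity(n):
    """Maximum number of electrons in the shell of total quantum number n."""
    return sum(capacity(l) for l in range(n))


def running_totals(order):
    """Total number of electrons after each group in `order` is filled."""
    totals, total = [], 0
    for group in order:
        total += capacity(LETTERS.index(group[1]))
        totals.append(total)
    return totals


if __name__ == "__main__":
    print([shell_capacity(n) for n in (1, 2, 3, 4)])
    for group, total in zip(ORDER, running_totals(ORDER)):
        print(group, total)
```

The running totals are the atomic numbers of the elements that complete each group, which is why the inert gases and alkaline earths fall where they do.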
The series of elements in which the 3d electrons are being added is the iron group. These have considerable similarity, because although the 3d electrons are less tightly bound than the 4s, they are farther inside the atom, and the outside parts of these atoms are quite similar. When we go beyond Kr, we repeat the same sort of process, having another group of 18 elements in which the 5s, 4d, and 5p electrons are being added, before coming to the next inert gas Xe. The transition group which we go through here is the Pd group. Next after that, after adding the two 6s electrons to form Ba, the whole group of 14 4f electrons is added, resulting in a long group of remarkably similar elements, the rare earths. As a matter of fact, these elements have one 5d electron each, so that our scheme is a little misleading in respect to them. After finishing the 4f group, the normal procedure repeats itself, the 5d and 6p being added to complete the shell of 18 interrupted by the rare earths and terminated at Rn, and finally the 7-quantum electrons being added to give the elements of the last, incompleted row of the table.

It is often convenient, in describing an atom in any state, to give the number of electrons having each quantum number by a symbol, as $1s^2\,2s^2\,2p^6\,3s$ for the normal state of Na, meaning that there are two 1s, two 2s, six 2p, one 3s electron. Such an arrangement is called a configuration. And a transition between two stationary states can be conveniently denoted by writing the two configurations. Thus the transition $1s^2\,2s^2\,2p^6\,3p \rightarrow 1s^2\,2s^2\,2p^6\,3s$ for Na is a line of the principal series in the optical spectrum; the transition Na $1s^2\,2s^2\,2p^6\,3s \rightarrow$ Na$^+$ $1s\,2s^2\,2p^6\,3s$ represents the process of ionizing one of the K electrons of Na; and so on.

247. The Method of Self-consistent Fields.
We have just seen that the electrons of an atom act approximately as if they moved in central fields, rather than under the action of the other electrons, and have shown that this leads to quantum numbers for the electrons, to shells resulting from this, and to the periodic properties of the elements as successive shells are filled up. In making this idea more precise, we meet the method of self-consistent fields, developed by Hartree. In this method we assume that

1. The field in which the $k$th electron moves is obtained by taking the wave function of each of the other electrons, squaring to get the average density of charge due to these electrons, averaging over angles to get a spherically symmetrical distribution, adding all these charge densities together, and finding the potential, together with that of the nucleus, by electrostatics. This, of course, will give a nonhydrogenic field, different for each electron.

2. To get the wave function of the $k$th electron, we solve Schrodinger's equation for the field above, using the appropriate quantum numbers. Since the field is nonhydrogenic, we must use numerical methods, or the Wentzel-Kramers-Brillouin method.

Having found these final wave functions, they must be the same ones with which we started step 1. It is this fact which leads to the name "self-consistent." If we started with arbitrary wave functions, computed a field, solved for the wave functions in that field, the final functions would not in general agree with the original ones. If we keep on repeating the process, however, using in each case the final wave functions of one stage of the calculation to begin the next, it rapidly converges so that after a few repetitions the field is approximately self-consistent. This method has been used for numerical computation of the wave functions of a number of atoms.

248. Effective Nuclear Charges.
The method of self-consistent fields, though quite accurate, demands numerical computation, and is not well suited for elementary calculations. We may instead approximate the wave function of each electron by a hydrogen wave function, corresponding to an effective nuclear charge $Z - S_i$. To get $S_i$, we should add up the total number of electrons within a sphere whose radius is the effective radius of the $i$th electron's wave function. It is easier to figure, not by means of the radius, but from the quantum number, since to a rough approximation the radius of an orbit is $n^2/(Z - S_i)$, so that electrons inside a given one are those of smaller total quantum number. The following table proves to give roughly the contribution to the shielding constant of a given electron from each other type of electron, valid for the electrons found in the light atoms. We see that the shielding of one electron by a second does not go suddenly from unity to zero as the shielding electron's quantum number becomes greater than that of the shielded electron, but instead changes gradually, in accordance with the fact that each electron really has charge distributed over all distances, and it is possible for part of the charge to be inside, part outside, a given radius.

Table 1. Contribution of One Shielding Electron, of Given Quantum Number, to Shielding Constant of Shielded Electron

Shielded      Shielding electron
electron      1s      2s      2p      3s      3p
1s            0.35
2s            0.85    0.35    0.35
2p            0.85    0.35    0.35
3s            1.00    0.85    0.85    0.35    0.35
3p            1.00    0.85    0.85    0.35    0.35

To illustrate the use of this table, let us take the case of Na, $Z = 11$, in its normal state $1s^2\,2s^2\,2p^6\,3s$. Evidently we have three shells, corresponding to the three values of $n$. Then we have

$n = 1$: $S = 0.35$, radius $= n^2/(Z - S) = 1/10.65 = 0.09$
$n = 2$: $S = 2(0.85) + 7(0.35) = 4.15$, radius $= 4/6.85 = 0.58$
$n = 3$: $S = 2 + 8(0.85) = 8.80$, radius $= 9/2.20 = 4.09$.

The inner radii are as given in Prob. 9, Chap.
XXXIII. The calculations we have given so far refer to wave functions, rather than energy levels. To investigate the latter, we must make a more careful discussion of the theory of the many-body problem and its treatment by Schrodinger's equation.

249. The Many-body Problem in Wave Mechanics. Our treatment of atomic structure so far has been rather intuitive, not based directly on Schrodinger's equation at all. We have not yet set up the problem of many bodies in wave mechanics. To do so, we proceed as follows: Let the problem have $N$ generalized coordinates, $q_1, \ldots, q_N$. Then we seek a wave function $\psi(q_1 \ldots q_N, t)$, such that $\psi^*\psi\, dq_1 \ldots dq_N$ gives the probability that the coordinates will be found at time $t$ in the region $dq_1 \ldots dq_N$. To set up Schrodinger's equation, we take the classical Hamiltonian function, convert it into an operator $H$ by substituting $\frac{h}{2\pi i}\frac{\partial}{\partial q_i}$ for $p_i$, and write the equation

$$H\psi = -\frac{h}{2\pi i}\frac{\partial \psi}{\partial t}.$$

We eliminate time as usual, and have a differential equation for $u(q_1 \ldots q_N)$, which is $Hu = Eu$, $E$ being the energy of the whole system.

There is one simple case of the many-body problem: that where there are many particles, exerting no forces on each other. That is, we may have $n$ particles, whose coordinates are $x_1 y_1 z_1 \ldots x_n y_n z_n$, and the potential is $V = V_1(x_1 y_1 z_1) + \cdots + V_n(x_n y_n z_n)$, without any terms involving coordinates of two particles simultaneously. For such a potential, $-\frac{\partial V}{\partial x_i} = -\frac{\partial V_i}{\partial x_i}(x_i y_i z_i)$, a force on the $i$th particle depending only on the coordinates of that particle. In such a case, we can separate variables, writing $u = u_1(x_1 y_1 z_1) \cdots u_n(x_n y_n z_n)$. For Schrodinger's equation can be written

$$\left[\left(-\frac{h^2}{8\pi^2 m_1}\nabla_1^2 + V_1\right) + \cdots + \left(-\frac{h^2}{8\pi^2 m_n}\nabla_n^2 + V_n\right)\right]u = Eu, \qquad (1)$$

where $\nabla_i^2$ means $\partial^2/\partial x_i^2 + \partial^2/\partial y_i^2 + \partial^2/\partial z_i^2$.
A separation of variables can be carried through in the usual way, and can be summarized as follows: if we write $u$ as a product, as above, then Schrodinger's equation is satisfied if

$$\left(-\frac{h^2}{8\pi^2 m_i}\nabla_i^2 + V_i\right)u_i = E_i u_i, \qquad E_1 + \cdots + E_n = E. \qquad (2)$$

In the case of atomic structure, and in general with the structure of matter, there are forces between the electrons. But here it is possible to make an approximation, as we have done: we replace the actual force between a given electron, say the $i$th, and the others, by the average which it would have from the mean distributions of the other electrons in space. Roughly we may say that, while the force with any particular arrangement of the other electrons will differ from this value, it will average out to give our mean value, and the deviations from the mean will not be so large as to destroy the approximation. Thus, using such a method, each electron becomes acted on, not by the other electrons, but by an averaged field. It is the motion in this field that we have considered in the present chapter.

250. Schrodinger's Equation and Effective Nuclear Charges. The result of the approximate calculation we have made has been a set of one-electron wave functions, one for each electron of the atom. These satisfy equations which, in atomic units, are

$$\left[-\nabla_i^2 - \frac{2(Z - S_i)}{r_i}\right]u_i = -\frac{(Z - S_i)^2}{n_i^2}u_i. \qquad (3)$$

Now the potential energy of the whole atom, in atomic units, is

$$V = -\sum_i \frac{2Z}{r_i} + \sum_{\text{all pairs}} \frac{2}{r_{ij}}, \qquad (4)$$

if $r_{ij}$ is the distance between the $i$th and $j$th electrons. Thus the Hamiltonian is

$$H = \sum_i \left(-\nabla_i^2 - \frac{2Z}{r_i} + \sum_{j \text{ inside } i} \frac{2}{r_{ij}} + \sum_{j \text{ in same shell as } i} \frac{1}{r_{ij}}\right), \qquad (5)$$

where the two summations are the same thing as the sum over all pairs, each pair of electrons in the same shell being counted twice with half weight. If now we assume that $u = u_1 \cdots u_n$, where the $u$'s are as we have found, and try to see how good an approximation this forms, we have, substituting for the Laplacians, from Eq.
(3),

$$Hu = -\left[\sum_i \frac{(Z - S_i)^2}{n_i^2}\right]u + \sum_i \left(-\frac{2S_i}{r_i} + \sum_{j \text{ inside } i} \frac{2}{r_{ij}} + \sum_{j \text{ in same shell as } i} \frac{1}{r_{ij}}\right)u. \qquad (6)$$

If Schrodinger's equation were satisfied, this would be $Eu$, where $E$ is a constant. This is not true; the first term is a constant times $u$, but the second is a variable function of the $r$'s times $u$. The average value of the last term, however, is approximately zero. For $2/r_{ij}$ is the potential, at the $i$th electron, of the $j$th electron. If the latter is inside, and we average over its position, and average to make it spherically symmetrical, the potential will be the same as if it were concentrated at the center, or will be $2/r_i$. For an electron in the same shell, it turns out that the average of $1/r_{ij}$ is about $2(0.35)/r_i$. The summation, for all electrons inside or in the same shell as $i$, is then essentially $2S_i/r_i$, just canceling the first term in the parentheses, and leaving as the result, using this approximate method of averaging,

$$Hu = -\left[\sum_i \frac{(Z - S_i)^2}{n_i^2}\right]u,$$

showing that we have an approximate solution, and that the energy of the atom is $-\sum_i (Z - S_i)^2/n_i^2$. This represents the negative of the energy required to remove all the electrons from the atom. If we wish to find the energy of the atom by first-order perturbation theory, we recall that we must find the diagonal term of the energy matrix, or $\int u^* H u\, dv$. This means averaging the energy over the wave function, or over the motions of the electrons; and to the same approximation we have just used, the summations average to zero, leaving the same energy we just found.

As an example of the calculation of energy, we can again take the case of Na. The energy of normal Na is, using the shielding constants found above,

$$-\left[2\left(\frac{10.65}{1}\right)^2 + 8\left(\frac{6.85}{2}\right)^2 + \left(\frac{2.20}{3}\right)^2\right] = -321.4 \text{ units}.$$

With one 1s electron removed, making the appropriate changes in shielding constants, the energy is

$$-\left[\left(\frac{11.00}{1}\right)^2 + 8\left(\frac{7.70}{2}\right)^2 + \left(\frac{3.20}{3}\right)^2\right] = -240.6 \text{ units}.$$

The difference is 80.8 units, or 1,094 volt-electrons, representing the ionization potential.
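The sums in this example are tedious by hand and are conveniently checked by machine. The sketch below is an illustration only, under the assumption that the entries of Table 1 may be summarized by the difference of total quantum numbers (0.35 for the same shell, 0.85 for the shell next inside, 1.00 for shells farther in, nothing for outer shells); the names `contribution`, `shielding`, and `total_energy` are ours. Evaluated directly, it gives about $-321.2$ units for normal Na and about $-240.7$ with a 1s electron removed, a difference of about 80.5 units, so that the last figures of the rounded values quoted in the text should not be pressed too far.

```python
# Energy of an atom or ion as -sum_i ((Z - S_i)/n_i)^2, in the atomic units
# of the text, with shielding constants built from Table 1.

def contribution(n_shielded, n_shielding):
    """Shielding of an electron in shell n_shielded by one in shell n_shielding."""
    if n_shielding > n_shielded:
        return 0.0          # outer electrons are taken not to shield
    if n_shielding == n_shielded:
        return 0.35
    if n_shielding == n_shielded - 1:
        return 0.85
    return 1.00


def shielding(n, occupation):
    """Shielding constant S for one electron in shell n.

    occupation maps total quantum number -> number of electrons in that shell,
    the shielded electron itself included.
    """
    s = 0.0
    for m, count in occupation.items():
        others = count - 1 if m == n else count
        s += others * contribution(n, m)
    return s


def total_energy(z, occupation):
    """Return -sum over electrons of ((Z - S)/n)^2."""
    energy = 0.0
    for n, count in occupation.items():
        if count == 0:
            continue
        z_eff = z - shielding(n, occupation)
        energy -= count * (z_eff / n) ** 2
    return energy


if __name__ == "__main__":
    na = {1: 2, 2: 8, 3: 1}
    holes = {"1s": {1: 1, 2: 8, 3: 1},
             "2s": {1: 2, 2: 7, 3: 1},
             "3s": {1: 2, 2: 8, 3: 0}}
    e_atom = total_energy(11, na)
    print(f"Na atom: {e_atom:.1f} units")
    for label, ion in holes.items():
        print(f"{label} removed: ionization energy {total_energy(11, ion) - e_atom:.2f} units")
```

The same function serves for any other electron removed, simply by altering the occupation numbers of the ion.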
Similarly with the 2s removed, the energy is

$$-\left[2\left(\frac{10.65}{1}\right)^2 + 7\left(\frac{7.20}{2}\right)^2 + \left(\frac{3.05}{3}\right)^2\right] = -318.6,$$

leaving an ionization potential of 2.8 units, or about 38 volt-electrons. Finally the ionization potential of the 3s, as we immediately see, is simply $(2.20/3)^2 = 0.54$ unit $= 7.3$ volt-electrons.

251. Ionization Potentials and One-electron Energies. In the method of self-consistent fields, each electronic wave function is the solution of a central field problem, for a single electron. This one-electron problem has a certain energy, as found in the preceding chapter, always negative, very large numerically if the electron is tightly bound, smaller if it is more loosely bound, and it is natural to ask for the interpretation of this energy. The connection with tightness of binding suggests directly that the one-electron energies measure the work required to remove the electron in question, or the ionization potential, the negative energies being the negative of the ionization potentials. This proves in fact to be the case. One can compute these ionization potentials, by finding the energies of the atom and ion and subtracting, and the result proves to be, to the first order of perturbation, just the one-electron energy. Thus the K ionization potential is given by the distance of the 1s energy level below zero in the corresponding one-electron problem, and so on. The connection is not very accurate, but it is close enough to be very useful.

Our method of effective nuclear charges, being an approximation to the method of self-consistent fields, should show the same property, and we can give a simple though not entirely satisfactory proof. The negative of the ionization potential is the energy of the atom, minus the energy of the ion.
If the $i$th electron is to be removed, and if $S_j$ represents a shielding constant in the atom, $S_j'$ for the ion, then the energy of the atom, minus the energy of the ion, is

$$-\frac{(Z - S_i)^2}{n_i^2} - \sum_{j \neq i} \frac{(Z - S_j)^2 - (Z - S_j')^2}{n_j^2}.$$

If we set $S_j' = S_j - (S_j - S_j')$, and expand, this is

$$-\frac{(Z - S_i)^2}{n_i^2} + \sum_{j \neq i} \frac{(Z - S_j)^2 + 2(Z - S_j)(S_j - S_j') + (S_j - S_j')^2 - (Z - S_j)^2}{n_j^2}.$$

Our simple proof holds only in case there is no other electron in the same shell as the $i$th, and if we assume that each electron shields by either 1 or 0. Then we have $S_j - S_j' = 0$ if the $j$th electron is inside the $i$th, 1 if the $j$th is outside the $i$th. Thus for the negative of the ionization potential we have

$$-\frac{(Z - S_i)^2}{n_i^2} + \sum_{j \text{ outside } i} \frac{2(Z - S_j + \tfrac{1}{2})}{n_j^2}. \qquad (8)$$

In this case we can easily find the energy of the one-electron problem. The potential energy of the field in which the larger part of the $i$th wave function is located is

$$-\frac{2(Z - S_i)}{r_i} + \sum_{j \text{ outside } i} \frac{2}{r_j}, \qquad (9)$$

where $S_i$ represents the number of electrons inside the $i$th, and the summation is for all outer electrons, assuming constant mean radii, which are approximated by $1/r_j = (Z - S_j)/n_j^2$. To verify the correctness of this potential, we note that the corresponding force, $-2(Z - S_i)/r^2$, is what we should have for the charge inside the sphere concentrated at the nucleus, and the constant terms of the summation are added to make the potential continuous at the outer shells. Thus the wave equation for $u_i$ is

$$\left[-\nabla_i^2 - \frac{2(Z - S_i)}{r_i} + \sum_{j \text{ outside } i} \frac{2}{r_j}\right]u_i = E_i u_i,$$

which gives immediately

$$E_i = -\frac{(Z - S_i)^2}{n_i^2} + \sum_{j \text{ outside } i} \frac{2}{r_j},$$

or, using the value of $1/r_j$,

$$E_i = -\frac{(Z - S_i)^2}{n_i^2} + \sum_{j \text{ outside } i} \frac{2(Z - S_j)}{n_j^2},$$

agreeing with the value (8) already found, except that the correction term $\tfrac{1}{2}$ in $(Z - S_j + \tfrac{1}{2})$ is missing. This formula, moreover, is interesting, in that it shows that the shielding has two effects on the energy: (1) The energy has the term $-(Z - S_i)^2/n_i^2$ instead of $-Z^2/n_i^2$, as we should have with an electron in the unshielded field of the nucleus.
This effect, reducing the magnitude of the ionization potential, is called the inner shielding, since it comes from the inner electrons. (2) There is also the summation over the outer electrons, likewise resulting in a reduction of ionization potential, and called the outer shielding. As we see from our derivation, the outer shielding results from the rearrangement of shielding constants of the outer electrons when an inner electron is removed.

Problems

1. The K series in the x-ray spectra comes when a K electron is knocked out, and an L, M, . . . electron falls into the vacant place in the K shell. The lines are $K_\alpha$ (if an L electron falls in), $K_\beta$ (an M), etc. Write down the configurations before and after the $K_\alpha$ and $K_\beta$ transitions of Mo.

2. Show that the frequencies of the lines of the K series are less than the frequency of light necessary to cause ionization of the K electron. Compute the K ionization potential and the $K_\alpha$ line for Ca, and show that they fit in with the general case.

3. Moseley's law is that the square roots of x-ray term values (ionization potentials) form a linear function of the atomic number. This would obviously be true if there were just inner shielding, for then the square root would be simply $(Z - S)/n$. Investigate how closely this is true when there is outer shielding as well, computing K and L term values for electrons from $Z = 10$ to $Z = 20$, and seeing how closely the square roots fall on straight lines.

4. Iso-electronic sequences are sets of ions, all of the same number of electrons, but with different nuclear charges, and hence different degrees of ionization. Compute the ionization potentials, or term values, $1s^2\,2s^2\,2p \rightarrow 1s^2\,2s^2$, $1s^2\,2s^2\,3s \rightarrow 1s^2\,2s^2$, for the atoms $Z = 5$ to 10, indicating what ions they are (as $Z = 6$, $1s^2\,2s^2\,2p$ is C$^+$).
Investigate to see whether these term values follow Moseley's law that the square root of the term value is a linear function of atomic number.

5. Using the approximation that the radius of a shell is $n^2/(Z - S)$, draw curves giving the radius of each shell as function of $Z$ for all atoms up to $Z = 20$ (compute only enough values to draw the curves).

6. In a closed shell of p electrons, there are two electrons of $m = 1$, two of $m = 0$, two of $m = -1$. Using the spherical harmonics for these cases, compute the squares of the wave functions, treating these as electron densities. Add the densities of all electrons, showing that the sum is independent of angle, or that the p shell is spherically symmetrical. The same thing is also true of any completed shell.

7. Given a spherical distribution of charge, where the potential is $2Z_p/r$, and the force $2Z_f/r^2$, where $Z_p$, $Z_f$ are both functions of $r$, prove that $Z_f = Z_p - r(dZ_p/dr)$.

8. Assuming that the electrons are located on the surfaces of spheres of radius $n^2/(Z - S)$, find and plot $Z_f$ and $Z_p$ for Na$^+$ as functions of $r$.

CHAPTER XXXV

INTERATOMIC FORCES AND MOLECULAR STRUCTURE

Atoms by themselves have only a few interesting properties: their spectra, their dielectric and magnetic properties, hardly any others. It is when they come into combination with each other that problems of real physical and chemical interest arise. Atoms act on each other with forces, in some cases attractive and in others repulsive, and in this chapter we shall consider the nature of these forces, how they arise, and what their results are in their effect on the physical and chemical structure of the substance. Interatomic forces in the first place hold atoms together to form molecules; this forms the province of chemistry. But in turn they hold molecules together into their various states of aggregation, as solids, liquids, and gases, and this is ordinarily considered to be part of physics.
The distinction, however, is purely arbitrary, and not at all general. We shall begin by discussing the most important types of force, with a little consideration of the types of substances in which they are found. All the interatomic forces of interest in the structure of matter are electrical or in some cases magnetic; the only other forces, gravitational, are far too small to be of significance. We arrange the different types according to the way they depend on the distance of separation of the atoms.

252. Ionic Forces. If two atoms are ionized, they attract or repel according to the inverse square. If the net charge on one is $z_1$ units, on the other $z_2$, the potential energy between them is $z_1 z_2 e^2/r$, if $r$ is the distance between.

253. Polarization Force. Atoms are polarizable, as we have seen in discussing refractive index, in Sec. 172, Chap. XXIV. That is, an atom in an electric field $E$ acquires an electric moment $\alpha E$. Now suppose that we have an atom or an ion in the presence of another ion. The ion produces a field $ze/r^2$. This in turn polarizes the first atom or ion, producing a moment $\alpha z e/r^2$. The resulting dipole reacts back on the ion, attracting it with a force equal to the field of the dipole (equal to the moment of the dipole times $2/r^3$) times the charge on the ion, or $-2\alpha z^2 e^2/r^5$. The potential of this force is $-\alpha z^2 e^2/2r^4$, giving always an attraction.

254. Van der Waals' Force. Ionic and polarization forces are met only with ions. The forces observable at largest distances between neutral atoms or molecules, and hence of importance in the behavior of liquids and imperfect gases, are called Van der Waals' forces, on account of their appearance in Van der Waals' equation of state for imperfect gases. They arise as follows. An atom is generally spherically symmetrical and thus on the average has no external electric field.
But this is only on the average; instantaneously it is not spherical, but the electrons are at arbitrary positions, and the result gives a dipole moment, averaging to zero, but instantaneously different from zero. This dipole polarizes a second atom or molecule. Thus the field of a dipole of moment $\mu$ is $\mu/r^3 \times$ function of angle. In the two special cases where the dipole points straight toward, or away from, the atom, the function of angle has the values $\pm 2$, respectively. In that case, the induced dipole in the second molecule is $\pm(2\alpha\mu/r^3)$. This produces a field back on the first, equal to $\pm(4\alpha\mu/r^6)$. The force by which it acts on the original dipole is equal to the rate of change of the field with $r$, times the dipole moment, times a function of angle which is $\pm 1$ in the two cases considered, or

$$\left(\mp\frac{24\alpha\mu}{r^7}\right)(\pm\mu) = -\frac{24\alpha\mu^2}{r^7},$$

with potential energy $-4\alpha\mu^2/r^6$. If we had considered all angles, we should have got a different constant, but in any case an attraction, $-\text{constant} \times \alpha\mu^2/r^6$.

To calculate the polarization and Van der Waals' forces, we should have to find $\alpha$ and $\mu$. The calculations for these are difficult and will not be attempted here, though a derivation will be given in a later chapter. For the present, however, we can get some semiempirical formulas which will serve for rough calculations. First, the polarizability $\alpha$ has the dimensions of a volume. An argument from a simple model in Sec. 172 showed that, at least in order of magnitude, the polarizability of a spherical atom is equal to the cube of its radius. Now the radius of an electron's orbit can be approximated by $n^2 a_0/(Z - S)$, so that we might imagine that the polarizability of an atom could be approximated by the sum of such terms, cubed, for all electrons.
Empirically, one finds that this gives about the right dependence on $Z$, but not very accurately for $n$: the contribution of an electron to the polarizability proves to be approximately

$$\left(\frac{n^2 a_0}{Z - S}\right)^3 \times \begin{cases} 4.5 & \text{if } n = 1 \\ 1.1 & \text{if } n = 2 \\ 0.65 & \text{if } n = 3, \text{ etc.} \end{cases}$$

The total polarizability is the sum of such contributions, for all electrons. As we readily see, only the electrons in the outer shell make an appreciable contribution, since they have the largest values of $n$ and the smallest $Z$'s. Hence we may simply multiply the number $\nu$ of electrons in this shell by the term above. Thus $\nu = 2$ for an ion with the same structure as the He atom, 8 for one built like Ne, 8 for one like A, etc.

For the Van der Waals' force, we expect an energy $-\text{constant} \times \alpha\mu^2/r^6$. We shall consider the problem more in detail in Chap. XLII, Sec. 301, where it is shown that the energy is

$$-\frac{3}{2}\,\frac{\alpha\mu^2}{r^6},$$

and where in addition we have the relation

$$\mu^2 = \frac{\alpha\,\Delta E}{2}.$$

In this formula, $\Delta E$ is the difference of energy of that transition from the normal state which contributes most to the refractive index and dispersion. Ordinarily this can be taken to be the same as the ionization potential of the atom. Thus, since we know how to find ionization potentials from our effective nuclear charges, we may use empirical or approximately calculated polarizabilities to get coefficients for the Van der Waals' attraction.

The three types of force we have enumerated all fall off as inverse powers of the distance. If we inquire further, we find that there is a whole series of terms, in higher and higher inverse powers of $r$. Thus between ions we have a series commencing with terms in $1/r$ and $1/r^4$, between atoms commencing in $1/r^6$, but having higher terms arising from interaction of the induced dipoles of both atoms with each other, interaction of dipoles and charges with quadrupole moments, etc. The complete series would be difficult to evaluate.
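The semiempirical rules of Sec. 254 lend themselves to a short numerical sketch. The following Python fragment is only an illustration, not part of the original treatment: it evaluates the polarizability rule ($\nu$ electrons in the outer shell, each contributing the empirical factor times the cube of $n^2/(Z - S)$) and the Van der Waals' coefficient obtained by combining the energy $-\tfrac{3}{2}\alpha\mu^2/r^6$ with $\mu^2 = \alpha\,\Delta E/2$. Lengths are taken in units of $a_0$ and energies in Rydberg units; the function names are hypothetical, and the value $Z - S = 1.70$ used for a helium-like atom is an assumption brought in for the example, since no shielding constants are quoted in this section.

```python
# Empirical per-electron factors quoted in the text, keyed by principal quantum number n.
EMPIRICAL_FACTOR = {1: 4.5, 2: 1.1, 3: 0.65}


def polarizability(n, z_eff, n_outer):
    """Polarizability in units of a0**3.

    n       : principal quantum number of the outer shell
    z_eff   : effective nuclear charge Z - S for that shell
    n_outer : number of electrons (nu) in the outer shell
    """
    radius = n ** 2 / z_eff  # shell radius in units of a0
    return n_outer * EMPIRICAL_FACTOR[n] * radius ** 3


def vdw_coefficient(alpha, delta_e):
    """Coefficient c6 in the energy -c6 / r**6.

    Uses energy = -(3/2) * alpha * mu^2 / r^6 with mu^2 = alpha * delta_e / 2,
    so that c6 = (3/4) * alpha^2 * delta_e.
    """
    mu_squared = alpha * delta_e / 2.0
    return 1.5 * alpha * mu_squared


# Helium-like example. Z - S = 1.70 is an ASSUMED shielding value,
# and delta_e = 1.80 (Rydberg units) is taken as the ionization potential.
alpha_he = polarizability(n=1, z_eff=1.70, n_outer=2)
c6_he = vdw_coefficient(alpha_he, delta_e=1.80)
print("alpha (a0^3):", alpha_he)
print("c6 (Rydberg * a0^6):", c6_he)
```

With other atoms one would change only `n`, `z_eff`, and `n_outer`; the result should be read as an order-of-magnitude estimate, in keeping with the rough character of the rule itself.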
In addition to these forces, there are other quite different ones, coming when the atoms are so close that their charge distributions actually begin to overlap. Since these distributions fall off exponentially with distance, as in hydrogen functions, these types of force all fall off exponentially and for that reason cannot be expanded in inverse powers of $r$ at all (the exponential function possesses a singularity at infinity and so cannot be expanded in power series in $1/r$). The forces are sometimes grouped together, but we prefer to break them up into three classes.

255. Penetration or Coulomb Force. As one atom penetrates another, there will be forces on account of pure electrostatics, even if the two atoms do not distort each other. Let the outer shell of each atom penetrate within that of the other (Fig. 74a). Then the part of each which penetrates the other finds itself in a field attracting it toward the nucleus of the other, since it is no longer shielded by all shells of the other. The result is an attraction of the charge of each for the other, pulling the whole atoms together. On the other hand, as the atoms get still closer, the whole system of inner shells of one would get inside the outer shell of the other (see Fig. 74b). These inner cores are both positively charged on the whole and will repel, a repulsion more than enough to counteract the attraction, in general. Hence at sufficiently close distance, the penetration force will be repulsive.

Fig. 74. Penetration of one atom by another. Circles represent shells of electrons. (a) Attraction. Negative charge of each atom penetrates within the outer shell of the other, being attracted to the positive nucleus. (b) Repulsion. Nucleus of each atom penetrates the outer shell of the other, the repulsion of the nuclei for each other outbalancing the attractions.
In between, there will be some distance at which the force will be zero and there will be equilibrium.

256. Valence Attraction. The penetration force acts even though the atoms are not distorted. The force of attraction principally concerned in valence, however, is an additional force resulting from the distortion of one atom by the other. The distortion produced by ordinary electrostatics is at least approximately taken care of by computing the polarization, as stated in Sec. 253, but there is an additional effect, resulting from the operation of the exclusion principle, and the existence of electron spins, and which leads to a tendency for electrons to form stable pairs, agreeing with the ideas of G. N. Lewis regarding homopolar valence, or valence attraction between uncharged atoms. To understand this, even approximately, we must look more closely into the exclusion principle. In addition to their charge, electrons also act like little magnets, having a north and south pole. This is as if the charge were to rotate, forming a little electric current around a circle, and corresponding magnetic lines of force. The result is called electron spin. Now when we have a pair of electrons, it turns out that their spins can be oriented in just two possible ways: either parallel to each other, or opposite or antiparallel. If they are parallel, then the exclusion principle comes in and says they cannot be in the same shell. But if they are opposite the principle does not operate. It is a result of this that the allowed numbers of electrons in the various groups in an atom are all even numbers. Thus, in the s shell, after we have one electron, we can add a second if its spin is opposite to the first, for then the exclusion principle does not act. But if we now try to add a third, its spin must be parallel to one of the two already there, and the exclusion forbids it.
Similarly a p group really contains three different subgroups, each of which can contain but two electrons, with opposite spins. Analogous results hold for the other groups. We see, then, that the subgroup of two electrons with opposite spins is a configuration which electrons like to form, and that only two electrons can enter such a configuration, so that there is a tendency toward pairing. But now it appears that such a pair can be formed by two electrons in different atoms, just as well as by two in the same atom. Thus if each of two atoms has just one electron, rather than two, in one of its subgroups, and if these two electrons have opposite spin, they can form a pair held in common by the two atoms, actually localized in the space between the atoms, and tending simply by electrostatic forces to hold the atoms together, the attractions of this negative concentration of charge for the nuclei, which must have a net amount of positive charge, more than counterbalancing the repulsions between like charges at large distances, though at smaller distances the force becomes repulsive, on account of the ordinary penetration effect. This is the origin of homopolar valence. We see that every electron lacking from a closed shell can be interpreted as giving the possibility of forming a valence bond, so that for example the halogens have a single valence, oxygen and sulphur have two, hydrogen has one (one electron missing from 1s), and so on.

257. Atomic Repulsions. If one brings two atoms close enough together, they will always repel and resist further approach. This is what we know physically as the impenetrability of matter. It is a result of the exclusion principle, again. If we force two atoms so close together that the shells of the two atoms overlap, and if these shells are all filled with electrons, then we are really trying to force more electrons into the same region of space than the exclusion principle allows.
What happens is that the electrons then move outside of this region, the atoms become distorted, and the resulting increase of energy is interpreted as a force of repulsion between the atoms. These actions commence as soon as closed shells begin to overlap appreciably, and as a result the atoms have rather sharp boundaries, and for some purposes may be considered as having definite sizes.

We should notice that, if the outer shells of the atoms are not closed, this repulsion can be altered. Thus, if two lithium atoms approach, each having a closed K shell but only one electron in its 2s shell, either of two things can happen. If the two L electrons happen to have parallel spin, then the exclusion principle operates between them, and they will repel each other, as if they had only closed shells. But if the spins are opposite, then the outer shells can coalesce, forming a shared electron pair, and resulting in attraction. Even in such a case, however, we finally meet repulsion as we bring the atoms together. In the first place, at close enough separation, the K shells would begin to overlap, and since they are closed shells they would repel in the usual way. But also the pure electrostatic interaction gives repulsion at small enough distances. For with more and more penetration, we get to the point where the nuclei are close together, in the midst of a combined set of shells of electrons from both atoms. Increasing closeness will then increase the repulsion between the nuclei, without much changing anything else, and this repulsion will finally become great enough to cancel all other effects.

258. Analytical Formulas for Valence and Repulsive Forces. The three types of force which we have just been discussing, Coulomb penetration force, valence attraction, and repulsion, all depend on the actual overlapping of the charge distributions of two atoms.
Here again we can find a simple approximate formula, which is yet accurate enough to be decidedly useful. Since the charge distribution falls off in general exponentially with the distance, we may assume that the potential energy also falls off exponentially: energy $= Ce^{-ar}$, where $r$ is the distance between nuclei. The constant $C$ is negative for attractions, positive for repulsions. The value of $a$, of course, will be different with each type of force, and each type of atom. Nevertheless, we can give extremely rough rules which yet suffice to give the order of magnitude of $a$. First we set up, for each of our two atoms, the "radius" of the outer shell, $n^2/(Z - S)$. We add these radii for the two, multiply by 1 if the electrons in the outer shells are p electrons, as in closed shells, but by 1.4 if they are s electrons in both atoms, as in a molecule made of two alkali atoms. Let the result be $r_0$. Then as far as order of magnitude is concerned, the energy is a constant times $e^{-4(r/r_0)}$ for the pure repulsion between closed shells. In the valence attraction case, where the curve has a minimum, we can combine the valence and Coulomb forces, since both behave about the same. Then the result is approximately

$$C'e^{-6(r/r_0)} - C''e^{-3(r/r_0)},$$

the first term representing the repulsion close in, the second the attraction farther out. The constants as we have written them are for the normal state of the atoms and molecules, and in this case it is found that the equilibrium distance for the valence attraction comes approximately at $r_0$. This results, as we readily verify, by writing the formula in the form

$$D\left[e^{-6(r - r_0)/r_0} - 2e^{-3(r - r_0)/r_0}\right]$$

or more generally

$$De^{-2a(r - r_0)} - 2De^{-a(r - r_0)}, \qquad (1)$$

where $a$ is a constant, which we have set approximately equal to $3/r_0$. This form of potential curve has been used by Morse, and he has tabulated values of $D$, $a$, and $r_0$ for a number of molecules, in excited as well as normal states.
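The statement that the curve (1) has its minimum at $r_0$, and that it coincides with the two-exponential form when $a = 3/r_0$, can be checked numerically. The following Python sketch does this for one set of parameters; the values of $D$ and $r_0$ are hypothetical numbers chosen only for illustration (they are not Morse's tabulated constants), and the function names are likewise assumptions.

```python
import math


def morse(r, d, a, r0):
    """Morse potential energy, Eq. (1): D exp(-2a(r-r0)) - 2D exp(-a(r-r0))."""
    x = math.exp(-a * (r - r0))
    return d * x * x - 2.0 * d * x


def two_exponential(r, d, r0):
    """The same curve written as C' exp(-6 r/r0) - C'' exp(-3 r/r0).

    With a = 3/r0, comparison with Eq. (1) gives C' = D e^6 and C'' = 2 D e^3.
    """
    c_prime = d * math.exp(6.0)
    c_double_prime = 2.0 * d * math.exp(3.0)
    return (c_prime * math.exp(-6.0 * r / r0)
            - c_double_prime * math.exp(-3.0 * r / r0))


# Hypothetical illustrative parameters (arbitrary consistent units).
D = 4.0
R0 = 1.5
A = 3.0 / R0

# Scan r from 0.5*r0 to 3.5*r0 and locate the lowest point of the curve.
grid = [R0 * (0.5 + 0.001 * i) for i in range(3001)]
r_min = min(grid, key=lambda r: morse(r, D, A, R0))

print("minimum found at r =", r_min)
print("energy at r0 =", morse(R0, D, A, R0))
print("difference of the two forms at r = 2.0:",
      morse(2.0, D, A, R0) - two_exponential(2.0, D, R0))
```

The scan is deliberately crude; it is meant only to show that the minimum falls at $r_0$ with energy $-D$, not to serve as a fitting procedure.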
The constant coefficient $D$, or the corresponding coefficient in the pure repulsive energy, is not easily given in a general way. We can easily see its significance, however. In Fig. 75 we plot a Morse potential curve, observing that it has a minimum at $r_0$, the energy at this point being $-D$, while at infinite separation the energy is zero. Thus $D$ represents the energy required to pull the atoms apart to infinity if they are initially at rest at the equilibrium distance, or, in other words, the energy of dissociation of the pair of atoms. These energies, for actual molecules, vary between a fraction of a volt-electron and several volt-electrons, depending on the tightness of binding of the molecule. A few simple rules help in estimating $D$, as for instance that the larger $r_0$, the smaller $D$ tends to be (for example, F₂ is more tightly bound than I₂, the F atom being smaller than I); molecules with a double or triple valence bond have larger $D$'s than with single bonds; etc.

The repulsive energy between closed shells, which we have approximated by $Ce^{-4(r/r_0)}$, is generally associated with an ionic or Van der Waals' attraction, resulting again in a minimum. This minimum, however, is ordinarily at much larger distances than $r_0$, more nearly $2r_0$ or even larger. This is in consequence of two things: first, the attractive forces are rather weaker than the valence attractions, and second, the repulsion between closed shells is naturally larger, and effective at larger distances, than the repulsion found in valence compounds. In an actual case, where we know the Van der Waals' or ionic force, we can then make an estimate of the distance of separation at the minimum, and find $C$ from the condition that the correct total potential has zero slope at this point.
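The zero-slope condition just stated can be written out for the case of a Van der Waals' attraction. If the total energy is taken as $Ce^{-ar} - c_6/r^6$, setting its derivative equal to zero at the estimated minimum $r_m$ gives $C = 6c_6 e^{a r_m}/(a\,r_m^7)$. The Python sketch below evaluates this and checks the slope by a finite difference. The symbol $c_6$, the function names, and the numerical inputs are all hypothetical, introduced only for illustration.

```python
import math


def repulsion_constant(c6, a, r_min):
    """Constant C for which C exp(-a r) - c6 / r**6 has zero slope at r_min.

    The slope is -a C exp(-a r) + 6 c6 / r**7; setting it to zero gives C.
    """
    return 6.0 * c6 * math.exp(a * r_min) / (a * r_min ** 7)


def total_energy(r, c, a, c6):
    """Exponential repulsion plus inverse-sixth-power attraction."""
    return c * math.exp(-a * r) - c6 / r ** 6


# Hypothetical inputs in arbitrary consistent units.
C6 = 1.0
A_REP = 2.5
R_MIN = 5.0

C = repulsion_constant(C6, A_REP, R_MIN)

# Central finite difference to confirm that the slope vanishes at R_MIN.
h = 1e-5
slope = (total_energy(R_MIN + h, C, A_REP, C6)
         - total_energy(R_MIN - h, C, A_REP, C6)) / (2.0 * h)

print("C =", C)
print("slope at r_min =", slope)
print("energy at r_min =", total_energy(R_MIN, C, A_REP, C6))
```

One further point follows from the second derivative of the same expression: the stationary point is a true minimum only when $a\,r_m > 7$, which holds for the inputs chosen here ($a\,r_m = 12.5$).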
To get a number comparable with those met in valence attraction, we should write the repulsion in the form $De^{-4(r - r_0)/r_0}$. Then in actual cases $D$ comes out of the order of a few volt-electrons.

Fig. 75. Morse potential curve, $De^{-2a(r - r_0)} - 2De^{-a(r - r_0)}$.

Often one finds the repulsive forces of which we have just spoken approximated by an inverse power of $r$, as $b/r^n$, where $n$ proves to be about 8 or 9. We immediately see that both functions, exponential and inverse power, behave similarly, being large for small $r$, small for large $r$, so that either form can be used, though, since the repulsion depends on penetration, which actually goes off exponentially, we can be sure that the inverse power term is not so accurate. We can readily find out why $n$ has about the value 9. The repulsive term is of importance, and can be found experimentally, and $n$ determined, near the minimum of the energy curve. For Van der Waals' or ionic forces, as we have mentioned, this proves to come at about $2r_0$. Then suppose that we choose $b$ and $n$ so that $b/r^n$ has the same value and slope as $Ce^{-4(r/r_0)}$ when $r = 2r_0$. We have

$$\frac{b}{(2r_0)^n} = Ce^{-4(2r_0/r_0)}$$

and

$$\frac{nb}{(2r_0)^{n+1}} = \frac{4}{r_0}\,Ce^{-4(2r_0/r_0)},$$

from which, dividing one by the other, $2r_0/n = r_0/4$, $n = 8$, approximately as is found experimentally. Many discussions, particularly of the structure of ionic crystals, are based on this inverse power formula, which has been used by Born and others.

259. Types of Substances: Valence Compounds. Now that we have investigated the types of interatomic forces, we should consider them with reference to the different types of substances in which they occur. Broadly speaking, there are two main types of substances, corresponding to the two principal kinds of interatomic attractions, the ionic and the valence forces. Let us arrange our valence compounds roughly in order of melting or boiling points, starting with the most volatile, and ending with the most stable.
The first substances on the list are not compounds at all, and indicate valence only in a sort of negative way: they are the inert gases, He, Ne, A, Kr, Xe. Since the outer shells of these are already completed, they form no ions, and they have no electrons to be shared and have no possibility of valence forces, and form no compounds. Next we come to a group of diatomic molecules, for example H₂, O₂, N₂, F₂, Cl₂, Br₂, CO, HCl, HBr, etc. These are held together by valence forces (HCl and HBr are somewhat ambiguous, and might be considered to be ionic compounds; this ambiguity is met in almost all H compounds). For example, each atom in H₂ has one electron; they share these, making a pair. In O₂, each atom has six L electrons; but they share two pairs (a double bond). As we go on, we come next to fairly simple polyatomic molecules. We have water, ammonia, methane: H₂O, NH₃, CH₄, all rather plainly valence compounds (though the ambiguity of which we spoke previously makes an ionic interpretation possible as well), with each hydrogen held by a single valence bond to the other atom. We might well include with these the ammonium ion, NH₄⁺, presumably built like methane. Other simple ones are CO₂, CS₂, with double bonds. Then we certainly should include some of the simple organic compounds, as acetylene C₂H₂ (triple bond between the carbons), ethylene C₂H₄ (double bond between the carbons), ethane C₂H₆ (single bond). All these molecules of which we have spoken are held together by valence forces. On the other hand, there are also Van der Waals' forces between molecules, though of a smaller order of magnitude than the valence forces, and these hold the substances together in liquids and solids, all of low boiling points, but of increasing stability as the molecules become heavier and more complicated.
The very considerable difference in order of magnitude between the valence and the Van der Waals' forces is significant, for this brings it about that the separate molecules preserve their identity, even when crowded close together.

More complicated organic compounds naturally come next in the list. They still preserve to some extent the property of existing as separate molecules, in gas, liquid, and solid, so that they still have both valence forces between atoms, and Van der Waals' forces between molecules. But as the molecules get more and more complicated, the Van der Waals' forces get larger and larger proportionally, so that with the fairly complicated ones they are of the same order of magnitude as the valence forces. Many complicated organic compounds dissociate when heated, rather than going through a change of state, since the heat necessary to melt and boil the substances becomes more and more nearly equal to that required to break up the molecules. It becomes, in other words, harder and harder to distinguish separate molecules, the solid acting more and more like a single big molecule.

The silicates form a group of compounds slightly suggesting the organic compounds in their complexity. They contain the group SiO₄⁴⁻, which can be best described as a pure valence compound, Si(O⁻)₄, held together just like methane, Si being analogous to C. In many compounds the silicate groups are joined together, by sharing oxygens, as in the double group Si₂O₇⁶⁻, or (O⁻)₃Si-O-Si(O⁻)₃, a neutral O atom being joined by its two bonds to the two Si atoms. This process of sharing oxygens may continue, until finally there is a network formed through the whole crystal, the metallic ions, as Ca⁺⁺, etc., merely fitting into empty space in the network, and all traces of molecular structure being lost. Thus these crystals are held together by forces so strong that they are not easily broken up.
They are insoluble and refractory, and in fact form a great proportion of all the minerals.

260. Metals. The metals form a type of substance more or less by themselves, but in general resembling valence compounds. There is a definite indication, at least in some of them, that there is a network of valence forces between the atoms, running through the metal, and holding it together to form a solid. At the same time, the simple Coulomb penetration force seems to account for a considerable part of the cohesion of metals. The network of valences seems to be connected with the electrical conductivity: an electron shared between two atoms can go to either one, and if the sharing exists through the solid, the electrons can migrate and carry a current. For many purposes, it is more correct in a metal to give up the idea that an electron is attached to a given atom at all, and treat them as free to move from one place to another, like the molecules of a perfect gas. The typical metallic states are solid and liquid. When a metal is vaporized, the tendency toward molecular formation does not seem to be strong. The vapors of such metals as have been examined show both monatomic and diatomic molecules; one wonders if polyatomic ones would not also be found if the experiment were made, acting simply like little pieces of the large metallic crystal.

261. Ionic Compounds. The ionic compounds are not so easy to classify in a definite order as valence compounds, principally because they are more alike. The primary fact about ionic compounds is that they are held together by electrostatic forces, the atoms appearing in the ionized state. The forces between the atoms depend only on the distance, and are independent of the presence of other atoms (except in the matter of polarization).
The laws governing the formation of ionic crystals are simple electrostatic ones, such as that positive and negative ions tend to approach as closely as possible, ions of the same sign go as far apart as possible, charges in small volumes tend to equalize themselves, and so on. As a particular result of these, there is no tendency to form molecules. It is almost impossible to build up out of ions any structure which would not have large electrostatic fields around it; and further ions would be attracted by these fields, so that the substance can build up indefinitely. Further, the electrostatic fields are rather large, compared with the valence forces.

The physical nature of these substances follows from the principles very easily. Their most characteristic form is the solid, where they form crystals in which the ions are arranged on a regular lattice. There is no trace of molecular structure in the lattice. They are hard and stable, often harder than metals, and of high melting point, although, of course, there is large variation from one compound to another. The vapor phase is an unimportant one for practically all ionic substances. Much more interesting in general than either liquid or vapor is the ionic state in water solution. Water, on account of its great dielectric constant, decreases all electrostatic forces. It thus almost removes the forces holding such a crystal together, and the solid breaks up into ions dissolved in the water.

When we ask about individual ionic compounds, we can well classify according to the ions from which they are made. The fundamental building stones are in every case ions of atoms; and the ions are of two sorts, positive and negative. The metals practically always form positive ions. They easily lose their valence electrons, as we have seen, so that all the electrons outside closed shells are removed, giving the alkali ions a charge 1, alkaline earths 2, the aluminum group 3, and so on.
As we go through the series of elements, we see that even the nonmetals sometimes form positive ions, as Cl with seven positive charges. Sometimes, however, their ions are negative, though about the only important atoms forming negative ions are O and S, forming singly and doubly charged ions, and the halides F⁻, Cl⁻, Br⁻, I⁻. These atoms add electrons to make a closed shell, instead of losing them. It is obvious why there are so few: adding electrons makes an atom negatively charged, so that it tends to repel other electrons. It is a process which cannot go on far. The negative halide ions generally exist by themselves. The oxide ion also exists by itself in oxides; but it also forms complex negative ions, with positive nonmetallic ones, which are the most important negative ions known. There are two alternative explanations of these radicals, either as pure ionic compounds, or as a combination of this with valence forces. For example, the sulphate ion can be regarded as being formed from a completely stripped sulphur ion and doubly charged oxygens: SO₄²⁻ = S⁶⁺(O²⁻)₄. But if we assume that the oxygens have only single negative charges, we have the other possible structure S²⁺(O⁻)₄. With this structure, the sulphur has four electrons, as carbon does, and so has four homopolar valence bonds; and the oxygens have the same electron structure as halogens, with a single valence bond. Thus the sulphur can be bound to the four oxygens by valence bonds, assisting the electrostatic attraction, and the structure would have similarity to methane or carbon tetrachloride. This latter explanation seems to be nearer the truth, since it can be calculated that the work required to form the completely stripped positive ion in the ionic model would be much greater than the work necessary to form the other structure.

Problems

1.
Find the potential energy between two helium atoms, using our approximate methods for calculating Van der Waals' and repulsive forces, and compare with the more accurate value

$$\left[7.7e^{-2.43(r/a_0)} - \frac{0.68}{(r/a_0)^6}\right] \times 10^{-10} \text{ ergs},$$

where $a_0 = 0.53 \times 10^{-8}$ cm. The polarizability of helium is $1.43a_0^3$, and its ionization potential 1.80 Rh. Compare these with simple calculated values.

2. Using the potential of Prob. 1, compute the equilibrium distance of separation between two helium atoms, and find the energy of dissociation, in ergs, and volt-electrons. Compare the equilibrium distance with the mean distance in the liquid, which has a density of 0.14, assuming atoms to be spaced on a regular lattice, so that the mean distance will be $1/\sqrt[3]{n}$, if $n$ is the number of atoms per cubic centimeter.

3. Find a radius of the helium atom for use in kinetic theory, assuming that two helium atoms at temperature 300° abs., with kinetic energy of $\tfrac{3}{2}kT$, collide head on. Find how close they come before they stop, and compare this molecular diameter with the distance $r_0$.

4. Two energy levels of H₂ coincide with the lowest energy of the atoms at infinite separation, one an attractive level (corresponding to valence binding, with the spins of the two electrons opposed), and one repulsive (the spins being parallel, so that the exclusion principle operates). Plot the energies of both terms as functions of distance, deriving the exponents according to our approximate laws, and determining the scale from the fact that the energy of dissociation of the molecule is about 4.3 electron volts, and that the energy of the repulsive term at the distance of molecular equilibrium is about 8 electron volts.

5.
Compute by our approximate laws the distance of separation $r_0$ of the atoms in the normal states of the valence compounds given below, and compare with the experimental values tabulated:

Compound    r₀ (Angstroms)    Compound    r₀ (Angstroms)
C₂          1.31              I₂          2.66
CN          1.17              NO          1.15
CO          1.15              O₂          1.21
H₂          0.76              SiN         1.57

6. Compute by our approximation polarizabilities for the following ions, and compare with the experimental values tabulated:

Ion     α × 10²⁴    Ion     α × 10²⁴
O⁻⁻     1.60        S⁻⁻     5.91
F⁻      0.868       Cl⁻     3.33
Ne      0.398       A       1.67
Na⁺     0.292       K⁺      1.12
Mg⁺⁺    0.173       Ca⁺⁺    0.785

7. Compare the distance of separation of atoms in the metallic crystals tabulated below with the sum of the quantities $n^2/(Z - S)$ for the two atoms.

Metal    Distance, Angstroms
Na       3.72
K        4.50
Ca       4.97

8. Compute the interatomic potential energy for NaCl at large distance, assuming it is composed of Na⁺ and Cl⁻, so that there will be the ionic force, and at the same time a polarization force, the sodium polarizing the chlorine. Show that the polarization of sodium by chlorine can be neglected. Using the polarizabilities of Prob. 6, show that the potential energy is

$$-\left[\frac{\ldots}{r/a_0} + \frac{\ldots}{(r/a_0)^4}\right] \text{ electron volts}.$$

9. The observed interatomic distance in the NaCl molecule is 2.73 Angstroms. Compute the constants $C$ and $a$ in the repulsive potential $Ce^{-ar}$. Find $a$ by the rules we have used, and determine $C$ so that the sum of the repulsive potential, and the attractive potential of Prob. 8, will have a minimum at the required distance.

10. Using the value of $a$ found in the preceding problem, find the equivalent value of $n$ in the repulsive potential $b/r^n$ for the NaCl problem, seeing how nearly it equals 9.

CHAPTER XXXVI

EQUATION OF STATE OF GASES

In the preceding chapter, we have considered interatomic forces, and their effect in determining the nature of substances.
When we begin to think more precisely of what we mean by the nature of substances, we conclude that the equation of state, and the closely related specific heat, are among the most important properties. We shall, therefore, take them up, giving necessarily enough thermodynamics and statistical mechanics to make calculations possible. Our investigations will be concerned with the thermal motion of the nuclei, moving under the interatomic potential which we have investigated. We shall naturally not be able to treat all sorts of substances; liquids, for instance, are so complicated that comparatively little progress has yet been made in understanding their properties. But gases and crystalline solids both present features of simplification which we can make use of.

262. Gases, Liquids, and Solids. Before passing to our analysis, let us consider what types of behavior we wish to explain. We can conveniently divide our discussion into gases, liquids, and solids. A monatomic gas, as an inert gas, is the simplest case: we have only to find its pressure, and total energy, as a function of volume and temperature, a task which can be carried out when we know the law of force between molecules. Gases of valence compounds, however, are more complicated. Their equation of state is not much harder to approximate than with monatomic gases, at least at low density, for on account of the rotation of the molecules they act on the average as if they were spherically symmetrical, and we need use only the intermolecular force averaged over angles in deriving the equation of state. In the specific heat, however, there are two forms of energy to consider: the translational kinetic energy of the molecules as a whole, which acts just as in monatomic gases, but also the rotational and vibrational energy of the individual molecules. This involves a different sort of calculation.
A still further complication appears in gases of some ionic substances, and of some valence compounds like I2 and NO. Here there are several types of molecule which can be simultaneously present in the gas, as $2\mathrm{I} \rightleftarrows \mathrm{I}_2$, $2\mathrm{Na} \rightleftarrows \mathrm{Na}_2$, and a proper treatment of the equation of state and specific heat would demand investigation of the equilibrium concentrations of the constituents, and their change with pressure and temperature.

A liquid is more complicated than a gas, in that the molecules are so closely in contact that they can no longer be treated as points. The liquids of the inert gases are, of course, exceptions, and there are a few other exceptions, diatomic and polyatomic substances whose molecules rotate even in the liquid, and so act like spherical systems. But with most liquids the molecules are bulky enough so that they do not rotate, and are definitely nonspherical in their average behavior. In considering the equations of state, in particular the compressibility, one can no longer, as with a gas, neglect the change of volume of the molecules with change of pressure. As the molecules become larger and larger, as with complicated organic compounds, the distinction between forces within and forces between molecules becomes lost, and the whole liquid must be treated as a single complex, the volume being determined more and more definitely by the space required to pack the atoms together.

The state of close-packing of atoms which we have just mentioned is definitely reached with solids. In fact, with noncrystalline solids, there is no sharp distinction between the states, as glass for instance shows, solidifying perfectly continuously from the liquid. The solids with definite melting points are the crystals, which have a definite lattice arrangement of the atoms which is not met in the liquid. This regularity of arrangement is the simplifying feature which makes it possible to treat crystals theoretically.
We can here commence our discussion with the state at absolute zero of temperature, where the atoms are at rest, and the whole crystal is in a position of equilibrium of the interatomic forces. The compressibility of such crystals can be fairly easily found from the forces, and this has been carried through particularly successfully for some of the ionic crystals. Then we can treat the crystal in thermal agitation by investigating the small oscillations of the atoms about their positions of equilibrium, using the method of normal coordinates. This makes it possible to consider both equation of state and specific heat with fair ease and generality.

Out of all the group of topics which we have suggested, there are a few which can be treated theoretically fairly successfully. First, there is the equation of state of rare gases, or of polyatomic gases whose molecules rotate so as to be spherically symmetrical on the average. This is what we take up in the present chapter. Secondly, there is the specific heat of rotation and vibration of molecules. Thirdly, one can consider the equilibrium between different types of molecules in a gas, the question of chemical equilibrium. Fourthly, the equation of state and specific heat of crystalline solids can be investigated. As a preliminary to these, we must extend our treatment of statistical mechanics, which we have already considered slightly in Chap. XXX. We first follow out the ideas of classical statistics a little further, treat the equation of state of a gas by those methods, and then go to quantum statistics, asking what changes are introduced.

263. The Canonical Ensemble. Following Chap. XXX, we consider a phase space; that is, a space in which each coordinate and each momentum of the system is plotted as a variable. Let the coordinates be $q_1 \ldots q_n$, the momenta $p_1 \ldots p_n$, the Hamiltonian function $H(q_1 \ldots p_n)$.
Then the phase space has 2n dimensions, and a point in this space represents a whole system (for instance a sample of gas). Next we set up an ensemble of points in this space, the number in the volume element $dq_1 \ldots dp_n$ being proportional to $f(q_1 \ldots p_n)\,dq_1 \ldots dp_n$. We assume all points of the ensemble to be equally likely; that is, we assume that the probability that the coordinates and momenta of the system actually lie in the region $dq_1 \ldots dp_n$ is proportional to the number of points of the ensemble in this region, or is proportional to $f\,dq_1 \ldots dp_n$. Then to find the average of any function of the coordinates and momenta, as $F(q_1 \ldots p_n)$, we multiply by f, integrate, and divide by the integral of f:

$$\bar F = \frac{\int F f\,dq_1 \ldots dp_n}{\int f\,dq_1 \ldots dp_n},$$

as we saw in Chap. XXXI.

Now in particular we set up the canonical ensemble,

$$f(q_1 \ldots p_n) = \text{constant}\; e^{-H(q_1 \ldots p_n)/kT},$$

where T is the absolute temperature. This ensemble gives the probability that a system in thermal equilibrium at temperature T will have its coordinates and momenta within given limits. The essential physical reason for this is the following: Suppose we have two systems, the first of coordinates and momenta $q_1 \ldots q_n$, $p_1 \ldots p_n$, the second $q_{n+1} \ldots q_m$, $p_{n+1} \ldots p_m$, with the separate Hamiltonian functions $H_1(q_1 \ldots p_n)$, $H_2(q_{n+1} \ldots p_m)$. Then physically we know that, if 1 and 2 are at the same temperature, and are then allowed to interact slightly, as by interchanging energy, it will be found that they are already in equilibrium with each other, and they already form a combined system in equilibrium at this temperature. This, in fact, is the definition of equality of temperature. But this is satisfied for the canonical ensemble. Thus if the separate systems are in equilibrium, their distribution functions are

$$f_1(q_1 \ldots p_n) = \text{constant}\; e^{-H_1(q_1 \ldots p_n)/kT},$$
$$f_2(q_{n+1} \ldots p_m) = \text{constant}\; e^{-H_2(q_{n+1} \ldots p_m)/kT}.$$

By the laws of probability, then, the probability that simultaneously the coordinates $q_1 \ldots p_n$ will be in the range $dq_1 \ldots dp_n$, and that $q_{n+1} \ldots p_m$ will be in $dq_{n+1} \ldots dp_m$, is proportional to the product of these probabilities, or

$$\text{constant}\; e^{-(H_1 + H_2)/kT}\,dq_1 \ldots dp_n\,dq_{n+1} \ldots dp_m.$$

But now suppose that the two systems are allowed to interact. The combined system will have an energy $H_1 + H_2 + H'$, where $H'$ is a small interaction potential, depending perhaps on all coordinates and momenta, negligibly small compared with the separate energies (as for instance an interatomic force between the negligibly small number of molecules on the boundary between the two systems, which permits the flow of heat between them). Then according to the canonical ensemble, the distribution of the whole combined system in thermal equilibrium should be

$$\text{constant}\; e^{-(H_1 + H_2 + H')/kT}.$$

But we observe that, except for the negligible energy $H'$, this is just the distribution before the interaction, so that the two systems were already in equilibrium before the interaction, and by definition are at the same temperature. This result is true only with the canonical ensemble, since it depends on the exponential form, adding exponents being equivalent to multiplying the functions.

Suppose we choose the constant in the definition of the canonical ensemble so that $\int f\,dq_1 \ldots dp_n = 1$, and avoid having to bother with the denominator in taking averages; this corresponds to normalizing a wave function. Further, let us write the constant in the form $e^{F/kT}/h^n$, so that we have

$$f(q_1 \ldots p_n) = \frac{e^{(F-H)/kT}}{h^n}$$

and

$$\int f(q_1 \ldots p_n)\,dq_1 \ldots dp_n = 1 = \frac{1}{h^n}\int e^{(F-H)/kT}\,dq_1 \ldots dp_n. \tag{1}$$

Here F is a quantity of the dimensions of energy, a function of T, chosen to make the constant have the correct value.
Since $e^{F/kT}$ is dimensionless, and since the function f must have the dimensions of $1/(dq_1 \ldots dp_n)$, in order to make its integral dimensionless, we must multiply by a constant of these dimensions. We have chosen $1/h^n$, which has the correct dimensions, since h is of the dimensions of pq. It is a purely arbitrary matter that we have chosen this particular constant, since in all ordinary physical applications the constant drops out anyway, and it does not imply the introduction of quantum theory into classical questions. We shall later see, however, that it simplifies the comparison with quantum theory to have it there.

264. The Free Energy. Let us take the factor $e^{F/kT}$ out of the integral above (it does not depend on the q's and p's), and divide through by it. Then we have

$$e^{-F/kT} = \frac{1}{h^n}\int e^{-H/kT}\,dq_1 \ldots dp_n. \tag{2}$$

The integral on the right is often called the integral of state (we shall later see cases where it degenerates to a sum, called the sum of state). It is fundamental in thermodynamic applications. The quantity F is the free energy, and we proceed to investigate its properties. We have seen that it depends on the temperature; but we must also observe that it depends on the volume. To see how this comes about, let us think about the Hamiltonian function H, in particular for a gas. We are considering only the nuclear motion, so that H includes the kinetic energy of the nuclei, and the potential energy of the interatomic forces, as discussed in the last chapter. But it also includes another term, if the gas is in an enclosure: the repulsion of the wall. The molecules of the gas, as they strike the wall, are repelled, so violently that they never penetrate the wall.
We may say approximately that the potential energy becomes rapidly infinite as any molecule approaches the wall, and is infinite if any molecule is outside it, so that $e^{-H/kT}$ is zero in that case, and there is no probability of finding one of the molecules outside. Now this term in the potential depends on the volume of the vessel, the rapid rise of potential coming at the edge of the volume, which is adjustable. Thus we have $H(q_1 \ldots p_n, v)$, where v is the volume, so that the free energy, which depends on an integral of this quantity, also depends on v as well as T.

Let us investigate the rates of change of the free energy with respect to volume and temperature. Differentiating Eq. (2) with respect to v, we have

$$-\frac{1}{kT}\frac{\partial F}{\partial v}\,e^{-F/kT} = \frac{1}{h^n}\int \left(-\frac{1}{kT}\frac{\partial H}{\partial v}\right) e^{-H/kT}\,dq_1 \ldots dp_n,$$

or

$$\frac{\partial F}{\partial v} = \overline{\frac{\partial H}{\partial v}},$$

where we remember the formula for finding the average of any quantity. Now consider a cylinder filled with gas, closed with a piston of unit area. If we decrease the volume, the decrease of volume being $-dv$, which therefore equals numerically the displacement of the piston, and if the pressure, and therefore the force on the piston, is p, we shall do the work $-p\,dv$ on the system. This will represent the increase in energy of the system, or dH. Hence we have $\partial H/\partial v = -p$. We may consider this relation as stating that p is the generalized force connected with a generalized coordinate v, and therefore equal to the negative derivative of the energy with respect to this coordinate. Performing the average, we then have

$$\left(\frac{\partial F}{\partial v}\right)_T = -p. \tag{3}$$

Next we can differentiate the free energy with respect to temperature. We have

$$\frac{\partial}{\partial T}\left(e^{-F/kT}\right) = \left(\frac{F}{kT^2} - \frac{1}{kT}\frac{\partial F}{\partial T}\right)e^{-F/kT} = \frac{1}{h^n}\int \frac{H}{kT^2}\,e^{-H/kT}\,dq_1 \ldots dp_n,$$

or

$$F - T\frac{\partial F}{\partial T} = \bar H.$$

If we define $\bar H$, the mean energy of all systems of the ensemble, as the internal energy E, we have

$$F - T\left(\frac{\partial F}{\partial T}\right)_v = E, \tag{4}$$

the familiar Gibbs-Helmholtz equation. From Eqs. (3) and (4), we have

$$dF = -p\,dv - \frac{E - F}{T}\,dT. \tag{5}$$

Now let us define the entropy S by the equation

$$F = E - TS, \qquad S = \frac{E - F}{T}. \tag{6}$$

Differentiating, this leads to $dE = dF + T\,dS + S\,dT$.
Similarly, Eq. (5) becomes $dF = -p\,dv - S\,dT$. Combining, we are led to

$$dE = T\,dS - p\,dv. \tag{7}$$

Equation (7) is the fundamental equation of thermodynamics, which we have derived from statistical methods. For the first law of thermodynamics is $dE = dQ - p\,dv$, where dQ is the heat absorbed in a process, $p\,dv$ is the work done by the system. And the second law of thermodynamics is that for a reversible change (as our change is, since we assume that the distribution is always given by a canonical ensemble, which means that it is always in equilibrium), the quantity dQ/T is a perfect differential, dS. Combining these statements, we have Eq. (7).

The specific heat can be found immediately by differentiating the energy with respect to temperature at constant volume. Using Eqs. (4) and (7), it is

$$C_v = \left(\frac{\partial E}{\partial T}\right)_v = T\left(\frac{\partial S}{\partial T}\right)_v = -T\left(\frac{\partial^2 F}{\partial T^2}\right)_v. \tag{8}$$

Thus we can find the specific heat, as well as the equation of state, by differentiating the free energy. This makes it a very useful function, and its calculation, by means of the integral of state, is the usual method of deriving information about physical properties of substances. Of course, we could derive the same information from the energy itself as a function of volume and temperature, but it is not quite so convenient to calculate.

265. Properties of Perfect Gases on Classical Theory. Let us apply the method of the free energy to the calculation of the equation of state and specific heat of a perfect gas, on classical mechanics. Let there be N molecules, each of mass m, so that

$$H = \sum_{i=1}^{N} \frac{p_i^2}{2m} + V,$$

where V, the potential energy, is zero so long as all molecules are within the volume v, but becomes infinite if even one molecule strays outside. Then we have

$$\int e^{-H/kT}\,dq_1 \ldots dp_N = \int_{-\infty}^{\infty} e^{-p_{x1}^2/2mkT}\,dp_{x1} \cdots \int_{-\infty}^{\infty} e^{-p_{zN}^2/2mkT}\,dp_{zN} \int \cdots \int e^{-V/kT}\,dx_1 \cdots dz_N. \tag{9}$$

Now by direct integration each of the integrals over the p's is simply $\sqrt{2\pi mkT}$.
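The value $\sqrt{2\pi mkT}$ quoted for each momentum integral can be checked numerically. The sketch below is only an illustration: the function name `momentum_integral`, the midpoint rule, the cutoff of twelve standard deviations, and the sample value of the product mkT (in arbitrary units) are all assumptions made here, not part of the text.

```python
import math

def momentum_integral(mkT, half_width=12.0, steps=20000):
    """Midpoint-rule estimate of the integral of exp(-p^2 / (2 m k T)) over all p.

    mkT is the product m*k*T in arbitrary units (an illustrative choice).
    The infinite range is cut off at +/- half_width standard deviations,
    beyond which the integrand is negligible.
    """
    sigma = math.sqrt(mkT)            # width of the Gaussian in p
    lo = -half_width * sigma
    dp = 2.0 * half_width * sigma / steps
    total = 0.0
    for i in range(steps):
        p = lo + (i + 0.5) * dp       # midpoint of the i-th interval
        total += math.exp(-p * p / (2.0 * mkT))
    return total * dp

mkT = 2.5                             # arbitrary illustrative value
numeric = momentum_integral(mkT)
exact = math.sqrt(2.0 * math.pi * mkT)
print(numeric, exact)
```

Since the integrand depends on p only through $p^2/2mkT$, the same check holds for every one of the 3N momentum factors in Eq. (9).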
The integral over coordinates is the integral of unity over all regions where the coordinates are inside v, and of zero over the outside, so that it is

$$\iiint_v dx_1\,dy_1\,dz_1 \cdots \iiint_v dx_N\,dy_N\,dz_N = v^N.$$

Thus we have finally

$$e^{-F/kT} = \left(\frac{\sqrt{2\pi mkT}}{h}\right)^{3N} v^N, \tag{10}$$

a function of temperature and volume as it should be, and the free energy itself is

$$F = -3NkT \ln \frac{\sqrt{2\pi mkT}}{h} - NkT \ln v.$$

From this we have at once $p = NkT/v$, giving the ordinary law of perfect gases, and $C_v = \tfrac{3}{2}Nk$, likewise a well-known result.

266. Properties of Imperfect Gases on Classical Theory. Next let us consider an imperfect monatomic gas, such as an inert gas. This differs only in that there is an additional term in the Hamiltonian, a sum of interaction energies of each pair of atoms:

$$H = \text{kinetic energy} + \sum_{\text{pairs } i,j} V_{ij} + \text{repulsion of walls},$$

where $V_{ij}$ may be the sum of a Van der Waals attraction between the ith and jth atoms at large relative distances, and an exponential repulsion at small distance. We then have the coordinate integral

$$\iiint dx_1\,dy_1\,dz_1 \cdots \iiint dx_N\,dy_N\,dz_N\; e^{-\sum V_{ij}/kT}. \tag{11}$$

The integration over the coordinates can be carried out in steps. First we integrate over the coordinates of the Nth molecule. The quantity $e^{-\sum V_{ij}/kT}$ can be factored: it is equal to

$$e^{-\sum' V_{ij}/kT}\; e^{-\sum_i V_{iN}/kT},$$

where $\sum'$ represents all those pairs which do not include the Nth molecule. The first factor then does not depend on the coordinates of the Nth molecule, and may be taken outside the integration over its coordinates, leaving

$$\iiint e^{-\sum_i V_{iN}/kT}\,dx_N\,dy_N\,dz_N.$$

We rewrite this as

$$\iiint_v dx_N\,dy_N\,dz_N - \iiint_v \left(1 - e^{-\sum_i V_{iN}/kT}\right) dx_N\,dy_N\,dz_N = v - W, \tag{12}$$

the first term being simply the volume, the second term being an integral to be evaluated. To investigate W, let us imagine all the molecules except the Nth being in definite positions of space. If the gas is rare, the chances are that they will be well separated from each other.
Now if the point $x_N y_N z_N$ is far from any of these molecules, the interatomic potentials $V_{iN}$ will all be small, and the integrand will be practically $1 - e^0 = 0$. Thus we have contributions to this integral only from the immediate neighborhood of each molecule. If all are alike, each of these contributions will be equal to

$$w = \iiint \left(1 - e^{-V_{iN}/kT}\right) dx_N\,dy_N\,dz_N,$$

a quantity which, though it formally involves the index i, actually is independent of i. In fact, if we imagine the ith molecule to be located at the origin, and remember that $V_{iN}$ is a function of r, the distance from the origin, we see at once that

$$w = \int_0^\infty 4\pi r^2 \left(1 - e^{-V/kT}\right) dr, \tag{13}$$

where we integrate to infinity instead of to the boundary of the vessel because the integrand is so small for r's larger than molecular dimensions that it makes no difference. In terms of this, we then have

$$W = (N - 1)w. \tag{14}$$

Now when we integrate over the coordinates of the (N - 1)st particle, we have just the same situation over again, except that there are only (N - 2) remaining molecules, and so on. Thus finally we have for the integral over coordinates

$$[v - (N - 1)w][v - (N - 2)w] \cdots v.$$

We can easily evaluate this product, by taking its logarithm, which is what we want anyway. This is

$$\sum_{s=0}^{N-1} \ln (v - sw) = \sum_{s=0}^{N-1} \ln v + \sum_{s=0}^{N-1} \ln (1 - sw/v).$$

The first term is $N \ln v$, which we should have for the perfect gas. For the second, we note that on account of the rarity of the gas, sw/v is always small compared with 1. Hence we have $\ln (1 - sw/v) = -sw/v$ approximately, and the sum is approximately equal to the integral with respect to s, or

$$-\int_0^{N-1} \frac{sw}{v}\,ds = -\frac{(N - 1)^2 w}{2v}.$$

To this order, then, neglecting unity in comparison with N, we have

$$F = -3NkT \ln \frac{\sqrt{2\pi mkT}}{h} - NkT \ln v + \frac{N^2 w kT}{2v}. \tag{15}$$

We then have $p = \dfrac{NkT}{v} + \dfrac{N^2 w kT}{2v^2} + \cdots$. This is often written in the form

$$\frac{pv}{RT} = 1 + \frac{Nw}{2v} + \cdots, \tag{16}$$

where R = Nk. This expression pv/RT is called the virial, and the coefficients of its expansion in inverse powers of the volume are called the virial coefficients, so that Nw/2 is the second virial coefficient. The results of experiments on imperfect gases are ordinarily given as tables of the virial coefficients as functions of temperature, and by the equation above we can compute the second coefficient, finding w if necessary by numerical integration from V(r). In addition to the pressure, we can, of course, find the specific heat, and it immediately comes out the same as for the perfect gas. We must remember, however, the rotational and vibrational specific heats of the polyatomic gases, which must be added to the translational terms to get the total specific heat.

267. Van der Waals' Equation. There is a limiting case in which we can compute w approximately. This is the case where the attractive part of V(r) varies slowly with r, while the repulsive part varies so rapidly that it can be considered zero if r is greater than $r_0$, infinite if r is less. This is what we should have if the molecules were rigid spheres of diameter $r_0$, attracted by the Van der Waals' attraction. If we let $V_0(r)$ represent the attraction, we have

$$e^{-V(r)/kT} = e^{-V_0(r)/kT} \text{ if } r > r_0, \qquad = 0 \text{ if } r < r_0.$$

The integral then is

$$w = \int_0^{r_0} 4\pi r^2 (1 - 0)\,dr + \int_{r_0}^\infty 4\pi r^2 \left[1 - e^{-V_0/kT}\right] dr.$$

The first term is simply $\tfrac{4}{3}\pi r_0^3$, the volume of a sphere of radius $r_0$, or eight times the volume of the sphere of diameter $r_0$ which represents a molecule. In the second integral, we may expand the exponential as a power series, since $V_0$ is relatively small: it is $1 - [1 - (V_0/kT) \cdots] = V_0/kT$. Thus this term is $\dfrac{1}{kT}\displaystyle\int_{r_0}^\infty 4\pi r^2 V_0\,dr + \cdots$. If, for instance, we have the type of Van der Waals' force considered in the last chapter, we have $V_0 = -\beta/r^6$, with $\beta$ as defined there. Then the term is $-4\pi\beta/(3 r_0^3 kT)$.
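As a numerical check on the expansion just made, one can evaluate w directly for rigid spheres of diameter $r_0$ with the attraction $V_0 = -\beta/r^6$ and compare with the first-order result $\tfrac{4}{3}\pi r_0^3 - 4\pi\beta/(3r_0^3kT)$. The following is a minimal sketch under stated assumptions: the function names, the dimensionless parameter `eps` $= \beta/(r_0^6 kT)$, the substitution $u = r_0/r$, and the use of Simpson's rule are choices made here for illustration, not taken from the text.

```python
import math

def w_exact(r0, eps, steps=2000):
    """Numerical w for rigid spheres of diameter r0 with V0 = -beta / r^6.

    eps = beta / (r0**6 * k * T) is the attraction at contact in units of kT
    (a hypothetical dimensionless parameter introduced for this sketch).
    The outer integral from r0 to infinity is mapped onto u = r0 / r in (0, 1]
    and evaluated with Simpson's rule (steps must be even).
    """
    core = 4.0 * math.pi * r0**3 / 3.0          # integral from 0 to r0 of 4 pi r^2

    def g(u):
        if u == 0.0:
            return 0.0                          # integrand behaves like -eps * u^2 near u = 0
        return -math.expm1(eps * u**6) / u**4   # (1 - exp(eps u^6)) / u^4

    h = 1.0 / steps
    s = g(0.0) + g(1.0)
    for i in range(1, steps):
        s += (4.0 if i % 2 else 2.0) * g(i * h)
    tail = 4.0 * math.pi * r0**3 * s * h / 3.0
    return core + tail

def w_first_order(r0, eps):
    """First-order result: (4/3) pi r0^3 - 4 pi beta / (3 r0^3 k T)."""
    return 4.0 * math.pi * r0**3 / 3.0 * (1.0 - eps)

for eps in (0.01, 0.1, 0.5):
    print(eps, w_exact(1.0, eps), w_first_order(1.0, eps))
```

For weak attraction the two values nearly coincide; as `eps` grows, the neglected higher powers of 1/T make the numerical w fall below the first-order estimate, in line with the remark that the further terms are in higher inverse powers of T.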
In this case, the second virial coefficient becomes

$$\frac{Nw}{2} = \frac{N}{2}\left(\frac{4}{3}\pi r_0^3\right) - \frac{2\pi N\beta}{3 r_0^3 kT},$$

the further terms being in higher inverse powers of T. We may write this

$$b - \frac{A}{RT},$$

where b is four times the volume of all molecules, $A = 2\pi N^2\beta/3r_0^3$. Actual gases have second virial coefficients which agree well with this formula. The pressure, in other words, is given by the result

$$\frac{pv}{RT} = 1 + \frac{b}{v} - \frac{A}{RTv} + \cdots, \tag{17}$$

being greater than for a perfect gas for large T (the b/v term preponderating), and less for small T. Physically, at high temperature, the finite size of the molecules, given by b, decreases the apparent volume, which produces an increase of pressure; while at lower temperatures the attractions between molecules, given by A, pull the gas together.

There is a very well-known equation, Van der Waals' equation, for the pressure of an imperfect gas. This is

$$\left(p + \frac{A}{v^2}\right)(v - b) = RT. \tag{18}$$

This differs from the equation of state of a perfect gas in two respects: in having the volume (v - b) in place of v, as if the molecules took up space, and in having the pressure increased by the amount $A/v^2$. The arguments used to deduce the equation are not reliable, and it cannot be regarded as more than a very useful empirical formula. But as far as the second virial coefficient is concerned, it is correct. If we compute pv/RT from it, and expand in inverse powers of volume and temperature, we can at once show that the expansion is what we have already found, as far as the term in 1/v, the values of b and A agreeing with those we have already given. The higher terms in the expansion, however, do not agree with what we should get by correct calculation.

268. Quantum Statistics. Distribution functions, and hence canonical ensembles, have a rather different meaning in quantum theory from what they have in classical mechanics.
For on account of the uncertainty principle we can no longer specify both coordinates and momenta, and hence cannot give functions of the q's and p's. Instead, as we have seen, we deal with a wave function $\psi$, such that $\bar\psi\psi$ gives the probability of finding the system at a given point of space. We could set up the corresponding quantity in classical statistics: if $f(q_1 \ldots p_n)$ is the ordinary distribution function, normalized so that its integral is unity, then $\int \cdots \int f(q_1 \ldots p_n)\,dp_1 \ldots dp_n$ would give a function of the q's, giving the probability of finding the system with given q's. Thus we should have the correspondence

$$\int \cdots \int f(q_1 \ldots p_n)\,dp_1 \ldots dp_n \sim \bar\psi\psi,$$

the two quantities agreeing at any rate in the limit of large quantum numbers, where classical and quantum theory approach each other.

It is not difficult to show that this correspondence holds, at least with one degree of freedom. First, we consider microcanonical ensembles, ensembles in which all systems have the same energy, but are distributed in phase as if they had started off at all arbitrary instants of time. In such a case, with one degree of freedom, the probability of finding a system in a given range of coordinates is proportional to the length of time a system would stay in that range, or is inversely as its velocity. But now the corresponding quantum ensemble is one in which all systems are in the same stationary state. And using the Wentzel-Kramers-Brillouin method, we have already seen, in Chap. XXIX, that $\bar\psi\psi$ is approximately proportional to $1/\sqrt{E - V}$, or inversely as the velocity, so that in this case we actually have the correspondence we desire between classical and quantum theories. The same thing can be shown with more than one degree of freedom.

Now any kind of classical ensemble which is independent of time can be made up of microcanonical ensembles; we may regard it as consisting of a certain distribution on each energy surface.
The corresponding situation is a quantum state in which all stationary states are excited at once, represented by a wave function $\psi = \sum_k c_k u_k e^{-2\pi i E_k t/h}$. The corresponding density, averaged over the rapid time fluctuations, is $\sum_k \bar c_k c_k \bar u_k u_k$, corresponding to a fraction $\bar c_k c_k$ of all the systems being in the kth stationary state, or belonging to the particular microcanonical ensemble having energy $E_k$. Let us see what is the classical ensemble corresponding to this combination. We may approximate it in the following way. Let us imagine the energy surfaces corresponding to the stationary states drawn in the classical phase space. Then let a fraction $\bar c_k c_k$ of the systems of the classical ensemble be uniformly distributed through the region between the kth and (k + 1)st energy surfaces, rather than just on the energy surfaces. We do this to get a continuous function. Then evidently the density of points between the kth and (k + 1)st surfaces will be $\bar c_k c_k$ divided by the volume of phase space between these surfaces. This volume, as we have seen, is $h^n$. Then we have the approximation

$$f(q_1 \ldots p_n) \sim \frac{\bar c_k c_k}{h^n} \quad \text{between } E_k \text{ and } E_{k+1}.$$

This gives a step-like function for f, which would approach continuity as the stationary states got closer together.

Now it is plain how we are to set up a canonical ensemble: we are to set $\bar c_k c_k$ proportional to $e^{-E_k/kT}$, and this will then give the right variation for f. Of course, our correspondence is not exact, but we assume that the quantum canonical ensemble is the exactly correct thing, the classical one the approximation to it. This is justified by the fact that we can give just the same argument for the canonical ensemble's representing thermal equilibrium in the quantum theory that we could in classical theory, and we know quantum theory to be the correct form in cases where it differs from classical theory.
Having the canonical ensemble in quantum theory, we can now proceed to the calculation of the free energy and equation of state as we did in classical theory. To get exact correspondence, we should set

$$f(q_1 \ldots p_n) = \frac{e^{(F-H)/kT}}{h^n} = \frac{\bar c_k c_k}{h^n}, \qquad \bar c_k c_k = e^{(F - E_k)/kT}.$$

Now the integral $\int \cdots \int f\,dq_1 \ldots dp_n$ goes over into a sum over all stationary states, multiplied by the volume of phase space associated with each stationary state, or $h^n$. Thus we have

$$1 = \sum_k \bar c_k c_k = \sum_k e^{(F - E_k)/kT},$$

and finally

$$e^{-F/kT} = \sum_k e^{-E_k/kT}. \tag{19}$$

In the case of degeneracy, where there are several stationary states of the same energy, the sum in Eq. (19) includes a term for each state, so that for an energy level with g states, we have g times the contribution from a single level of the same energy.

269. Quantum Theory of the Perfect Gas. We have already shown the correspondence between the classical and quantum expressions for free energy, to the approximation to which the Wentzel-Kramers-Brillouin method is accurate. This shows us that, for both the perfect and imperfect gases, we may expect to find about the same equation of state and specific heat on both theories. The errors in the method are large only when the wave length is changing very rapidly, and this actually comes, in this problem, only when two molecules are in collision with each other, or are colliding with the walls. Accurate discussion shows that there are appreciable corrections to the classical equation of state introduced in this way for the lightest gases (which therefore have longest wave length for a given velocity), but even these are small, and difficult to discuss. It is easy, however, to carry through the exact solution of the quantum theory of the perfect gas, and this will suffice to show the general situation. Let the gas be confined in a rectangular volume of sides A, B, C.
Then the wave functions for single molecules satisfying the boundary conditions of being zero on the boundaries are $\sin\dfrac{p\pi x}{A}\,\sin\dfrac{q\pi y}{B}\,\sin\dfrac{r\pi z}{C}$, where p, q, r are integers. A wave function for the whole gas can be built up from this by multiplying together functions for all the molecules, obtaining

$$u = \sin\frac{p_1\pi x_1}{A} \cdots \sin\frac{r_N\pi z_N}{C}.$$

Substituting in Schrodinger's equation, where V is the same potential of repulsion of the walls which we have considered before, we at once have

$$E = \frac{h^2}{8m}\left(\frac{p_1^2}{A^2} + \frac{q_1^2}{B^2} + \cdots + \frac{r_N^2}{C^2}\right). \tag{20}$$

To get all states, we must take all combinations of the integers $p_1 \ldots r_N$, each going from one to infinity. Thus we have

$$e^{-F/kT} = \sum_{p_1=1}^{\infty} \cdots \sum_{r_N=1}^{\infty} e^{-E/kT} = \sum_{p_1=1}^{\infty} e^{-h^2 p_1^2/8A^2mkT} \cdots \sum_{r_N=1}^{\infty} e^{-h^2 r_N^2/8C^2mkT}. \tag{21}$$

Now at reasonably high temperatures, T is so large that we have to go to large values of the integers p, etc., before the exponential begins to fall off appreciably. Thus the terms of our summation differ only slightly from each other, and we can replace them by an integral, one factor being

$$\int_0^\infty e^{-h^2 p_1^2/8A^2mkT}\,dp_1 = \frac{A\sqrt{2\pi mkT}}{h}.$$

Thus we have

$$e^{-F/kT} = (ABC)^N \left(\frac{\sqrt{2\pi mkT}}{h}\right)^{3N} = v^N \left(\frac{\sqrt{2\pi mkT}}{h}\right)^{3N},$$

where v = ABC is the volume of the gas, agreeing exactly with the classical value, so that equation of state and specific heat are not altered by using the quantum theory. At lower temperatures, where we cannot replace the summation by an integration, there will be discrepancies; the gas here is said to be "degenerate." At the same time other features enter the situation, different sorts of statistics known as the Bose and Fermi statistics, which we shall discuss later in other connections. We shall not work out the case of degeneracy here, since practically one cannot reach such low temperatures without liquefying the gas, and since we shall meet in the next chapter some corresponding situations in solids, which are actually attained, and are of much more physical interest.

Problems

1. For neon, experimentally, b = 20.6 c.c. for a mol.
Find the equivalent diameter $r_0$ of the atoms, regarded as rigid spheres. Compare this with the sum of the quantities $n^2/(Z - S)$ for the two atoms.

2. Using our approximate methods of dealing with Van der Waals' attraction, and using the value of $r_0$ from Prob. 1, compute the constant A for neon. Compare with the experimental value of $0.21 \times 10^{12}$ absolute units (you cannot expect very good agreement).

3. Using the experimental values of b and A for neon, from Probs. 1 and 2, draw a graph for the second virial coefficient as function of temperature. At what temperature does the graph cross the T axis, and what does this mean physically?

4. Carry out the expansion of Van der Waals' equation in virial form, showing that the second virial coefficient is as we have found, and computing the third virial coefficient as well.

5. Using Van der Waals' equation, plot a number of isothermals (lines of constant T, p being plotted against v). Choose both low and high temperatures. Use the constants given in Probs. 1 and 2 for neon. Note that at low temperatures the isothermals have a maximum and minimum, while at high temperatures they do not. As is well known, this maximum and minimum are not really present, but the region in which they occur is that in which gas and liquid are in equilibrium and exist as a mixture.

6. The critical point is that point where the maximum and minimum of the isothermals of Van der Waals' equation coincide, or where the first derivative of pressure with respect to volume at constant T has a double root. Compute the critical pressure, temperature, and volume, for neon, using the constants given in Probs. 1 and 2.

7. Hydrogen gas is confined in a container 10 cm. on a side. Find the order of magnitude of the temperature at which it would become degenerate; that is, the temperature at which most of the molecules would be in the lowest quantum state.

8. Compute the internal energy and entropy of a perfect gas by the classical theory.
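The replacement of each factor of the sum of state in Sec. 269 by an integral can be examined numerically. The sketch below works with the single dimensionless parameter $a = h^2/8A^2mkT$, so that one factor is $\sum_{p=1}^{\infty} e^{-ap^2}$ and its integral approximation is $\tfrac{1}{2}\sqrt{\pi/a} = A\sqrt{2\pi mkT}/h$. The function names, the truncation threshold, and the sample values of a are assumptions made for illustration only.

```python
import math

def sum_of_state_factor(a):
    """Sum over p = 1, 2, ... of exp(-a p^2), with a = h^2 / (8 A^2 m k T).

    The series is truncated once a term drops below 1e-18, which is
    negligible at double precision for the values of a used here.
    """
    total = 0.0
    p = 1
    while True:
        term = math.exp(-a * p * p)
        if term < 1e-18:
            break
        total += term
        p += 1
    return total

def integral_factor(a):
    """Integral from 0 to infinity of exp(-a p^2) dp = (1/2) sqrt(pi / a)."""
    return 0.5 * math.sqrt(math.pi / a)

# Small a corresponds to high temperature or a large container.
for a in (1e-4, 1e-2, 1.0, 10.0):
    s = sum_of_state_factor(a)
    i = integral_factor(a)
    print(f"a = {a:g}: sum = {s:.6g}, integral = {i:.6g}, ratio = {s / i:.6g}")
```

For small a the sum and the integral agree to a small fraction of their value, the sum lying below the integral by very nearly one-half; for a of order unity or larger only the first term or two of the sum matter and the integral is no longer a usable approximation, which is the degenerate regime mentioned in the text.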
CHAPTER XXXVII

NUCLEAR VIBRATIONS IN MOLECULES AND SOLIDS

In the last chapter, we have seen that in addition to the equation of state of gases, there was another range of phenomena which we could treat satisfactorily: the phenomena resulting from nuclear vibrations in molecules, leading to the vibrational specific heat, and in solids, leading to the equation of state and specific heat. The mathematical methods used in dealing with them are similar, so that they can be profitably treated together. At the same time, the question of the stationary states of vibrating molecules is of interest in itself, and can be easily taken up.

We shall begin with the problem of a crystalline solid; the extension to a molecule, which after all is not very different from a fragment of such a solid, is not hard to make. Our problem is to find pressure as function of temperature and volume, and specific heat. Ordinarily the measurements of the equation of state of a solid take the form of measuring the compressibility and thermal expansion: we express volume as a function of pressure and temperature, and have

$$\text{compressibility} = \kappa = -\frac{1}{v}\left(\frac{\partial v}{\partial p}\right)_T, \qquad \text{thermal expansion} = \frac{1}{v}\left(\frac{\partial v}{\partial T}\right)_p.$$

We shall thus compute these quantities. Now a solid, unlike a gas, behaves in a perfectly normal way at the absolute zero of temperature. Its volume is finite and definitely determined, being given from the equilibrium positions of its own atoms and molecules, which all pack closely enough to be in their equilibrium positions, since they have no kinetic energy. If external pressure is applied, the volume will decrease, and we can compute the compressibility. Temperature will not greatly change these quantities: temperature agitation slightly increases the volume, and makes the crystal more compressible, but these effects are small enough to be treated as perturbations of the state at absolute zero.
Hence we begin by considering the crystal without temperature agitation.

270. The Crystal at Absolute Zero. The energy of a crystal at the absolute zero, when its atoms are in perfectly definite positions, is simply the sum of the interaction energies for all pairs of atoms. In the position of equilibrium, this energy must be a minimum with respect to any possible small deformation of the crystal. Thus each separate atom is in equilibrium with respect to a slight displacement, keeping all other atoms fixed, so that it is at a minimum of potential, and could execute vibrations about this position of equilibrium, which to a first approximation would be simple harmonic. But there are other sorts of distortion to consider. For instance, we may decrease the whole volume slightly, moving the atoms closer together but preserving their relative arrangement, and the energy must be a minimum with respect to such a distortion. It is this which particularly interests us in computing compressibilities.

Now in a very simple crystal lattice, if the volume is decreased, the atoms will still have just the same arrangement. Thus NaCl has a cubic lattice, Na and Cl ions being found alternately at the corners of cubes, and squeezing the whole lattice would merely decrease the size of the cubes. The same thing is true of the simpler metals. It is easy to see that it is not always the case; a crystal composed of molecules rather loosely tied together would, under compression, have the molecules forced closer together without much change in the dimensions of each molecule. We do not consider such complicated cases, however, but rather assume that all interatomic distances $r$ are proportional to the dimension $\delta$ of the crystal as a whole.

Let us assume, then, a cube of crystal, of side $\delta$, a quantity which depends on the pressure. Let this cube contain $N$ atoms.
Now let us assume that the potential energy of the force between two atoms at distance $r$ is the sum of two terms: an attractive term, negative in sign, proportional to $1/r$ for ionic crystals, or exponential in $r$ for valence crystals; and a repulsive term, positive, and varying exponentially with $r$. The total energy of the crystal is the sum of all interatomic potential terms. This sum, for the exponential terms, is easy to compute. For these terms fall off so rapidly that practically only the nearest neighbors contribute appreciable terms to the energy. Thus we simply take each of the $N$ atoms, and sum up the exponential terms to its $s$ nearest neighbors. This, as we readily see, gives each pair counted twice over, so that the sum is $\tfrac{1}{2}NsAe^{-\alpha r}$, where $r$ is the distance to the nearest neighbor, $A$ and $\alpha$ are constants, or $\tfrac{1}{2}NsAe^{-a\delta} = Ce^{-a\delta}$, where $a = \alpha r/\delta$, since $r$ is proportional to $\delta$. For the inverse power attraction between ions, we cannot confine ourselves to nearest neighbors, since the forces fall off too slowly. Since each term in the energy is proportional to an inverse interatomic distance, and therefore to $1/\delta$, however, the energy will likewise be proportional to $1/\delta$, and the coefficient can be calculated by a proper method of summation over all ions.

Having the total energy, it is easy to compute the compressibility. We consider the ionic crystal, where the energy has the form

$$E = -\frac{K}{\delta} + Ce^{-a\delta}. \qquad (1)$$

We note that $dE = -p\,dv$, where $E$ is the energy, $p$ the pressure, $v = \delta^3$ is the volume, so that

$$p = -\frac{dE}{dv} = -\frac{dE}{d\delta}\frac{d\delta}{dv} = -\left(\frac{K}{\delta^2} - aCe^{-a\delta}\right)\frac{1}{3\delta^2}.$$

To compute the compressibility, we note that

$$\frac{dp}{dv} = \frac{dp}{d\delta}\frac{d\delta}{dv} = \frac{4K}{9\delta^7} - \frac{C(a^2 + 2a/\delta)e^{-a\delta}}{9\delta^4},$$

from which we get the compressibility by the definition $\kappa = -\frac{1}{v}\frac{dv}{dp}$. Now we are interested in the properties of the solid at zero pressure. Setting the expression above for $p$ equal to zero, a particular value of $\delta$ is determined, giving the volume at zero pressure.
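The zero-pressure condition is a transcendental equation for $\delta$, so a numerical illustration may be helpful. The following is a minimal sketch in Python, not part of the original text: it takes the energy $E = -K/\delta + Ce^{-a\delta}$, finds by bisection the side $\delta$ at which $p = 0$, and evaluates the compressibility there. The function names and the numerical values of $K$, $C$, and $a$ are assumptions chosen purely for illustration (dimensionless units in which the root falls at $\delta = 1$); they do not describe any real crystal.

```python
import math

def pressure(delta, K, C, a):
    """p = -dE/dv for E = -K/delta + C*exp(-a*delta), with v = delta**3."""
    dE_ddelta = K / delta**2 - a * C * math.exp(-a * delta)
    return -dE_ddelta / (3.0 * delta**2)

def dp_dv(delta, K, C, a):
    """Analytic dp/dv = (dp/ddelta) * (ddelta/dv)."""
    return (4.0 * K / (9.0 * delta**7)
            - C * (a**2 + 2.0 * a / delta) * math.exp(-a * delta)
            / (9.0 * delta**4))

def zero_pressure_side(K, C, a, iterations=200):
    """Bisect p(delta) = 0 on the branch delta > 2/a, where E has its minimum."""
    lo, hi = 2.0 / a, 100.0 / a
    if not (pressure(lo, K, C, a) > 0.0 > pressure(hi, K, C, a)):
        raise ValueError("no stable zero-pressure volume for these constants")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if pressure(mid, K, C, a) > 0.0:
            lo = mid   # still compressed: the root lies at larger delta
        else:
            hi = mid
    return 0.5 * (lo + hi)

def compressibility(delta, K, C, a):
    """kappa = -(1/v) dv/dp, evaluated from the analytic dp/dv."""
    return -1.0 / (delta**3 * dp_dv(delta, K, C, a))

# Illustrative (hypothetical) constants: C is chosen so that K = a*C*exp(-a),
# which places the zero-pressure side at delta = 1.
K, a = 1.0, 10.0
C = math.exp(10.0) / 10.0

delta0 = zero_pressure_side(K, C, a)
kappa0 = compressibility(delta0, K, C, a)
print("zero-pressure side:", delta0)
print("compressibility at zero pressure:", kappa0)
```

The search is confined to $\delta > 2/a$ because the condition $p = 0$ can be written $K = aC\delta^2 e^{-a\delta}$, whose right-hand side rises to a maximum at $\delta = 2/a$ and then falls; only the root on the falling branch is a minimum of the energy and gives a positive compressibility.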
In turn, we substitute this into the compressibility, obtaining K